A vendor service in the critical path for some Core transactions experienced an unrelated failure during maintenance, resulting in customer-facing errors.
One of our external vendors regularly performs routine maintenance on databases to maintain security and reliability. During one such maintenance window, on April 6 at 21:37 UTC, the vendor automatically migrated our application from one database copy to another (a standard operation). This failover coincided with an incident in their application engine, which prevented our application from restarting with the new database. As a result, a critical secondary service was unavailable, causing customer errors from 21:41 to 22:05 UTC for some classes of transactions (such as tokenizing new cards). Once the vendor resolved their application engine issue, our application started successfully and normal operation resumed.
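The report does not describe our application's restart logic, so the following is only an illustrative sketch: a client that retries its database connection with exponential backoff and jitter can ride out a brief failover window instead of failing outright. All names here (`connect_with_backoff`, `flaky_connect`) are hypothetical.

```python
import random
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.5):
    """Try to (re)connect, retrying with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)


# Simulated connection that fails twice (database still failing over),
# then succeeds -- standing in for the vendor-hosted database.
state = {"calls": 0}

def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("database not yet available after failover")
    return "connected"


print(connect_with_backoff(flaky_connect, base_delay=0.01))
```

In this sketch the third attempt succeeds, so the call returns "connected"; had the outage outlasted all five attempts, the final `ConnectionError` would propagate to the caller.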
We are in the process of moving this critical secondary service's databases to a new hosting provider, which will give us more control over database scaling and maintenance windows.
We apologize for this service disruption, and we will continue to drive internal resiliency and availability initiatives to reduce the impact of third-party outages.