On January 25th, an application code deployment caused health check failures, which reduced the compute capacity available to properly service API requests.
On 2023-01-25 at 22:41 UTC, an application code deployment removed application health endpoints that were used for critical infrastructure monitoring. As a result, otherwise healthy instances were falsely marked as requiring termination, and new replacement instances were created. Beginning at 2023-01-25 23:09 UTC, there were at times too few instances available to serve the volume of traffic, which impacted our ability to properly service API requests. Spreedly reverted the problematic code and launched new instances, which restored service levels for all customers.
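As an illustration only, and not a description of Spreedly's actual tooling, the failure mode above can be sketched in Python. The names `ProbeResult`, `instances_to_replace`, and `endpoint_probably_missing` are hypothetical, as is the assumption that health probes are plain HTTP status checks. The sketch shows why a replacement rule that treats any non-200 response as unhealthy cannot tell a removed endpoint from a failing instance, and how a fleet-wide 404 could be flagged as a likely deployment problem instead.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of one health probe (hypothetical structure)."""
    instance_id: str
    status_code: int  # HTTP status returned by the health endpoint


def instances_to_replace(results):
    """Naive rule: any instance whose probe is not 200 is marked for replacement."""
    return [r.instance_id for r in results if r.status_code != 200]


def endpoint_probably_missing(results):
    """Heuristic guard (an assumption, not a known practice at Spreedly):
    if every probed instance returns 404, the endpoint itself was likely
    removed, so the fleet should not be treated as unhealthy."""
    return bool(results) and all(r.status_code == 404 for r in results)


# After a deployment that removes the health endpoint, every probe returns 404.
fleet = [ProbeResult("i-1", 404), ProbeResult("i-2", 404)]

if endpoint_probably_missing(fleet):
    print("Health endpoint appears to be missing; hold replacements and alert.")
else:
    print("Replace:", instances_to_replace(fleet))
```

Without the guard, the naive rule would mark both instances in `fleet` for replacement even though the application on them is otherwise healthy.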
Spreedly is committed to a thorough review and rethinking of our change release processes and culture, with a focus on better documentation and cross-team training on cross-API interface requirements, improved deployment monitoring, and increased automatic paging for actionable alerts.
We deeply apologize to our customers for this interruption to service and for its impact on the businesses they have entrusted to Spreedly.