Server maintenance performed during Spreedly’s regular change window to increase the capacity of an internal service caused two brief service interruptions for APIs that access decrypted information.
At approximately 19:15 UTC, Spreedly Engineering initiated a capacity expansion of an internal service used for data decryption. A first wave of failed API requests occurred from 19:28 to 19:30 UTC during the maintenance window. A second wave of failed API requests occurred from 19:39 to 19:41 UTC while request traffic was being rebalanced to the capacity-expanded internal service. Service was restored at 19:41 UTC, at which point the system was fully operational.
API calls that required decrypted data were impacted during the two outage windows.
Spreedly Engineering is improving internal observability, implementing automated monitors, and investigating the use of automation for scaling capacity in the future to prevent this issue from recurring.