On December 22nd at 12:10 PM UTC, a major service disruption at Spreedly’s cloud hosting provider caused a period of increased errors for customers attempting to use the Spreedly API.
The Spreedly API returned error codes for approximately 1 hour.
Normal operations resumed once the issue with the cloud provider’s physical infrastructure was resolved.
Spreedly services are hosted and run on a common public cloud computing platform. A significant issue at this provider made core Spreedly services unavailable, causing customer requests to fail, in most cases with a 502 or 503 error response code. Spreedly engineers were engaged and began taking steps to remediate the issue. During this time, however, the cloud provider had begun resolving the issue on their end, and Spreedly services became available again once it was resolved.
Internal monitoring and alerting systems will be updated to detect and react to upstream service issues more quickly in the future. In addition, Spreedly engineers are investigating architectural changes with the goal of improving system resiliency during external issues like this.