Core requests resulting in some 500 responses
Incident Report for Spreedly
Postmortem

April 27th, 2023 — Temporary resource contention in our environment resulted in 500 response errors
A code merge inadvertently prompted creation of a large queue which resulted in transitory transaction errors.

What Happened
A code merge inadvertently affected our batch processing services, creating a large queue that put pressure on a critical-path database service. This resulted in some transitory transaction errors, the majority of which were retried successfully. After the issue was resolved by restarting the processing service and the queues returned to normal, we refined our queue monitoring process to include additional monitoring conditions.
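
As an illustration of what "additional monitoring conditions" on a queue can look like, the sketch below checks both absolute queue depth and how fast the queue is growing. It is a generic example, not Spreedly's actual monitoring: the names `QueueSample` and `evaluate_queue` and the threshold values are hypothetical and chosen only for demonstration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class QueueSample:
    """One observation of a queue (hypothetical structure for illustration)."""
    timestamp: float  # seconds since some reference point
    depth: int        # number of jobs waiting in the queue


def evaluate_queue(
    samples: Sequence[QueueSample],
    max_depth: int = 10_000,            # assumed threshold, not a real setting
    max_growth_per_sec: float = 50.0,   # assumed threshold, not a real setting
) -> List[str]:
    """Return alert messages for the given samples, ordered oldest to newest."""
    alerts: List[str] = []
    if not samples:
        return alerts

    latest = samples[-1]

    # Condition 1: the queue is already too deep.
    if latest.depth > max_depth:
        alerts.append(f"depth {latest.depth} exceeds {max_depth}")

    # Condition 2: the queue is growing quickly, even if it is not yet deep.
    if len(samples) >= 2:
        first = samples[0]
        elapsed = latest.timestamp - first.timestamp
        if elapsed > 0:
            growth = (latest.depth - first.depth) / elapsed
            if growth > max_growth_per_sec:
                alerts.append(
                    f"growth {growth:.1f}/s exceeds {max_growth_per_sec}/s"
                )

    return alerts


if __name__ == "__main__":
    window = [QueueSample(0.0, 100), QueueSample(60.0, 6_100)]
    for message in evaluate_queue(window):
        print(message)
```

A growth-rate condition of this kind can raise an alert while a queue is still below its depth threshold, which is the situation in which a backlog is building but has not yet put pressure on downstream services.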

Next Steps
We have started work to optimize our database performance in order to prevent similar issues in the future.

Conclusion
We apologize for this disruption to service, and will continue to drive internal improvements to avoid similar impact in the future.

Posted Jun 22, 2023 - 12:09 EDT

Resolved
After closely monitoring our resource utilization and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected in association with this incident.

We are completing our investigation into the causes of the incident and any residual impact. A post-incident review will be published.

We apologize for any inconvenience and disruption to service this caused for impacted customers.
Posted Apr 27, 2023 - 10:20 EDT
Update
While the previously experienced transaction errors are resolved, we are still monitoring our resource utilization closely until our system is confirmed stable. Customers may still experience some delays to services such as Dashboard reporting and callbacks.

We will provide an update within 24 hours.
Posted Apr 26, 2023 - 17:06 EDT
Monitoring
We identified that this issue was caused by temporary resource contention in our environment, and we have addressed it. We are monitoring to ensure this has fully resolved the issue.
Posted Apr 26, 2023 - 14:20 EDT
Update
We are actively investigating an issue which resulted in 500 error responses being returned from Core. We have identified that some customers have been impacted. We will update with additional information as this investigation progresses.
Posted Apr 26, 2023 - 14:05 EDT
Investigating
Spreedly has detected an issue that may impact customers, including potentially resulting in failed transactions. While we are providing an early notification in an effort to alert as quickly as possible, we are still investigating the actual scope and impact and will provide an update as soon as more details are available.
Posted Apr 26, 2023 - 13:50 EDT
This incident affected: Core Transactional API.