Intermittent Core 500 Errors
Incident Report for Spreedly
Postmortem

January 5, 2023 — Intermittent 500s in Core

January 5, 2023 — Intermittent Core 500 Errors Primarily Affecting Offsite Transactions

Spreedly’s core API server receives and responds to external API requests made by our clients.

What Happened

During an upgrade of Spreedly’s database capabilities, code was deployed that generated a large backlog of queued work. While that backlog was being processed, Core returned intermittent 500 errors to some customers on January 5, 2023, between 5:30pm UTC and 8:00pm UTC.

Next Steps

Spreedly is reviewing its database upgrade processes and queueing rules, and is adding additional alerting.

Posted Jan 12, 2023 - 11:08 EST

Resolved
Callback delivery may continue to experience delays; however, the queue is now shrinking rapidly.

We estimate that the queue will be fully caught up overnight. Any delayed or missed callbacks will be re-queued tomorrow.

The incident is being considered resolved.

We are still investigating the specific causes of the incident and any residual impact. A post-incident review will be published.

We apologize for any inconvenience and disruption to service.
Posted Jan 05, 2023 - 22:52 EST
Monitoring
The issue that caused intermittent 500 errors on Spreedly's Core API and affected asynchronous transactions has been mitigated. Customers are no longer experiencing 500 errors.

Customers may continue to see a delay in callback delivery as the queue is consumed. We continue to actively monitor the queue consumption. This issue is now in monitoring.
Posted Jan 05, 2023 - 15:58 EST
Investigating
We are investigating an issue causing intermittent Core 500 errors that affects less than 1% of transactions. We apologize for the inconvenience as our teams work to resolve this issue.
Posted Jan 05, 2023 - 13:39 EST
This incident affected: Core Transactional API.