Realex Transactions Failing
Incident Report for Spreedly
Postmortem

Between 19:13 UTC July 30 and 12:36 UTC July 31, all transactions against the Realex gateway failed.

What Happened

At 19:13 UTC July 30, we deployed an update to implement general credit transactions for the Realex gateway. This change added a new gateway-level field for Realex. For transactions using pre-existing Realex gateway tokens, which had no value stored for this field, we attempted to send an empty value to the gateway. This empty value caused an error before the connection to the gateway was opened, so every transaction on the Realex gateway failed. We reverted the code change at 12:36 UTC July 31, for a total downtime of 17 hours and 23 minutes.

While our testing covered sending this new gateway-level field as part of newly created gateway tokens, it did not cover existing gateway tokens for which the field is empty.
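
For illustration only, the gap described above can be sketched in Python. This is not Spreedly's code: the token layout, the function build_request_fields, and the field name new_gateway_field are all hypothetical. The sketch assumes the safe behavior is to omit the new field when a token has no value for it, and it exercises both newly created and pre-existing tokens.

```python
from typing import Dict


def build_request_fields(token: Dict[str, str]) -> Dict[str, str]:
    """Build the fields sent to the gateway for one transaction.

    `token` is a hypothetical stored gateway token. `new_gateway_field`
    is a made-up name standing in for the newly added gateway-level field.
    """
    fields = {"merchant_id": token["merchant_id"]}
    # Tokens created before the field existed may lack it or store it empty.
    # Omit the field in that case rather than sending an empty value.
    value = token.get("new_gateway_field")
    if value:
        fields["new_gateway_field"] = value
    return fields


# A newly created token carries the new field.
new_token = {"merchant_id": "m1", "new_gateway_field": "abc"}
# Pre-existing tokens: the field is either absent or stored as empty.
legacy_missing = {"merchant_id": "m1"}
legacy_empty = {"merchant_id": "m1", "new_gateway_field": ""}

assert build_request_fields(new_token)["new_gateway_field"] == "abc"
assert "new_gateway_field" not in build_request_fields(legacy_missing)
assert "new_gateway_field" not in build_request_fields(legacy_empty)
```

The two legacy cases are the ones that matter here: a test suite that only builds fresh tokens would pass while every existing token still hit the empty-value path.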

Next Steps

We are taking the following steps to help prevent issues like this in the future:

  1. Evaluating more test cases before code deployment to ensure as many scenarios are covered as possible
  2. Enhancing our internal monitoring infrastructure to identify issues like this sooner
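
As one hedged example of what such monitoring could look like, the sketch below tracks the failure rate over the most recent transactions for a single gateway and signals when it crosses a threshold. The class name FailureRateMonitor, the window size, the threshold, and the minimum sample count are all assumptions chosen for illustration, not a description of our actual monitoring.

```python
from collections import deque


class FailureRateMonitor:
    """Sliding-window failure-rate check for one gateway (hypothetical)."""

    def __init__(self, window: int = 50, threshold: float = 0.9,
                 min_samples: int = 20) -> None:
        # Keep only the most recent `window` outcomes; older ones drop off.
        self.results = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, succeeded: bool) -> None:
        """Record the outcome of one transaction."""
        self.results.append(succeeded)

    def should_alert(self) -> bool:
        """Return True when the recent failure rate reaches the threshold."""
        # Avoid alerting on a handful of transactions.
        if len(self.results) < self.min_samples:
            return False
        failures = sum(1 for ok in self.results if not ok)
        return failures / len(self.results) >= self.threshold


monitor = FailureRateMonitor(window=10, threshold=0.9, min_samples=5)
for _ in range(5):
    monitor.record(False)  # every transaction fails, as in this incident
print(monitor.should_alert())
```

A per-gateway check of this kind is aimed at the pattern seen in this incident, where one gateway fails completely while overall platform traffic may still look healthy.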

Conclusion

We apologize for any disruption this incident may have caused. You rely on Spreedly to report transaction status accurately, and we are taking steps to ensure that you can conduct your business confidently.

Posted Aug 06, 2019 - 12:25 EDT

Resolved
On July 30, 2019 at 19:13 UTC, Spreedly deployed a change to the Realex gateway. The change caused all subsequent Realex transactions to fail.

The change was reverted on July 31, 2019 at 12:36 UTC.

We have confirmed that Realex transactions are now processing successfully.

Any failures encountered during the outage are safe to retry as the transactions failed before contacting the gateway.
Posted Jul 31, 2019 - 08:48 EDT
Investigating
We are currently investigating an issue with the Realex gateway causing all transactions to fail.
Posted Jul 31, 2019 - 08:20 EDT
This incident affected: Core Transactional API.