Service Outage

Incident Report for Allotrac Status Page

Resolved

Incident Resolved.

Posted Oct 31, 2024 - 17:45 AEDT

Update

Today's outage began within minutes of the time of day at which Monday's incident occurred, which suggests the two may be related. Our ongoing investigation has found a significant surge in traffic during these incidents. The surge caused one of the five database servers to fail, which triggered a cascading failure.
To ensure data integrity, we promptly initiated the restoration process, and no data loss has occurred. Further analysis is underway to prevent future disruptions and address the root cause of these traffic spikes.

At 12:10 PM the primary database cluster for Allotrac experienced an issue that caused server number 3 to crash. The cluster automatically detected that a server had crashed in a manner that could lead to data sync issues and safely disabled access to the database. This caused affected customers to lose access to Allotrac while the issue was addressed.
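
For illustration only, the kind of safety check described above can be sketched as follows. This is not the actual cluster logic, which is not described in this report. The `NodeState` fields, the `should_accept_traffic` function, and the sequence numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    name: str
    alive: bool
    clean_shutdown: bool  # hypothetical flag: False if the node died mid-write
    last_committed: int   # hypothetical sequence number of the last committed write


def should_accept_traffic(nodes: List[NodeState]) -> bool:
    """Allow access only when no node crashed uncleanly and live nodes agree."""
    # An unclean crash means the cluster cannot be sure which writes survived.
    for node in nodes:
        if not node.alive and not node.clean_shutdown:
            return False
    live_positions = [node.last_committed for node in nodes if node.alive]
    if not live_positions:
        return False
    # All surviving nodes must be at the same position to be considered in sync.
    return len(set(live_positions)) == 1


# Made-up example: server-3 has crashed uncleanly, so access is disabled.
cluster = [
    NodeState("server-0", True, True, 120),
    NodeState("server-1", True, True, 120),
    NodeState("server-2", True, True, 120),
    NodeState("server-3", False, False, 118),
    NodeState("server-4", True, True, 120),
]
print(should_accept_traffic(cluster))  # False
```

The sketch favours blocking access over serving possibly inconsistent data, which matches the trade-off described in this update.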

By 12:13 PM the Allotrac DevOps team had verified the data integrity of the remaining database cluster servers and selected server 0 to initiate an automated restoration. Because only one server remained in sync, the recovery process was more involved than Monday's recovery.
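
As a rough sketch of how a restoration source might be chosen, the example below picks the server with the most recent committed position. It is an assumption for illustration, not the procedure the DevOps team used. The `pick_restore_source` function and the sequence numbers are hypothetical.

```python
from typing import Dict, Optional


def pick_restore_source(committed: Dict[str, int]) -> Optional[str]:
    """Return the name of the server holding the newest committed position."""
    if not committed:
        return None
    newest = max(committed.values())
    # Break ties by name so the choice is deterministic.
    candidates = sorted(name for name, seq in committed.items() if seq == newest)
    return candidates[0]


# Made-up positions: only server-0 holds the latest committed write.
positions = {"server-0": 120, "server-1": 117, "server-2": 118, "server-4": 117}
print(pick_restore_source(positions))  # server-0
```

When only one server holds the newest position, every other server has to be rebuilt from it, which is one reason such a recovery takes longer.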

At 2:25 PM the automated recovery of the cluster for server 1 was completed and the Allotrac DevOps team began restoring access to affected customers.

At 3:45 PM access was safely restored to all customers.

Work is still underway to restore the prior number of database servers so that performance does not degrade.

Posted Oct 31, 2024 - 16:14 AEDT

Update

Expect periods of slowness while additional database servers come online.

Posted Oct 31, 2024 - 15:33 AEDT

Monitoring

Service has been restored. We are monitoring performance.

Posted Oct 31, 2024 - 15:16 AEDT

Update

Restoring Service

Posted Oct 31, 2024 - 14:26 AEDT

Update

Service restoration will commence shortly.

Posted Oct 31, 2024 - 14:05 AEDT

Identified

The issue has been identified. We have initiated recovery.

Posted Oct 31, 2024 - 12:33 AEDT

Investigating

We are currently investigating the issue.

Posted Oct 31, 2024 - 12:21 AEDT

This incident affected: Web App and Database.