At ClearlyIP, we strive to be transparent with all our customers. Unfortunately, we had an extended outage on August 9, 2022, which affected customers on Clearly Cloud Canada (clusterpbx.com). Below is an update on the cause of the outage, how we addressed it, and how we plan to prevent it going forward. This does not excuse the incident, but it is intended to help you understand what happened and how we intend to prevent related issues in the future.
Subject: Clearly Cloud Canada Issue on August 9, 2022
Resolved Time: 09:32 am Central Time
Issue: Customers on Clearly Cloud Canada were unable to register devices, or to make or receive calls, for an extended period.
Details: On August 9, 2022, our Network Operations team was alerted to potential issues with Clearly Cloud Canada at 9:32 am Central Time. They immediately began reviewing the alerts to determine what was going on. At that point the team saw registrations working normally for some clients, but determined that registrations, as well as inbound and outbound calling, were affected for others.
Our engineers were notified and immediately started to investigate the cause of the outage. In reviewing the data, our engineers saw registration attempts arriving, but the system was not sending the responses it should have been. The team determined that the issue was related to a logging system that was overloaded and backlogged. The backlog delayed registration responses, so devices timed out and tried again, and each retry added further to the backlog.
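For readers who want to see why timeouts and retries make a backlog grow faster, here is a minimal toy model. It is only an illustration: the function name, the tick-based timing, and all of the numbers are hypothetical and are not measurements from Clearly Cloud Canada.

```python
def simulate_backlog(arrivals_per_tick, capacity_per_tick, timeout_ticks, ticks):
    """Toy model: return the backlog size after each tick.

    All parameters are hypothetical illustration values, not real traffic data.
    """
    backlog = 0
    history = []
    for _ in range(ticks):
        # New registration attempts arrive.
        backlog += arrivals_per_tick
        # The system answers as many as it can this tick.
        backlog -= min(backlog, capacity_per_tick)
        # Estimated wait for a queued request, in ticks.
        estimated_wait = backlog / capacity_per_tick
        if estimated_wait > timeout_ticks:
            # Devices give up waiting and send again, adding to the queue.
            backlog += arrivals_per_tick
        history.append(backlog)
    return history


# Healthy system: capacity exceeds arrivals, so nothing queues up.
healthy = simulate_backlog(arrivals_per_tick=10, capacity_per_tick=12,
                           timeout_ticks=2, ticks=20)

# Slowed system: capacity falls below arrivals, and retries start once
# the estimated wait passes the timeout.
overloaded = simulate_backlog(arrivals_per_tick=10, capacity_per_tick=8,
                              timeout_ticks=2, ticks=20)
```

In this model the backlog grows slowly while requests are merely slow, then grows much faster once devices start timing out and retrying.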
As a result, the team has been working on a fix for the related issues. We are adjusting the registration process so that a backup like this cannot occur. As part of the fix, we have reviewed our data retention strategy for registration logs, and we will be reducing the amount of registration log data we store, since the volume of stored log data contributed to this issue.
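One general way to keep a slow logging system from delaying registration responses is to put a bounded, non-blocking buffer between the two. The sketch below illustrates that general technique only. It is not a description of our actual fix, and the names `BoundedLogBuffer` and `handle_register` are hypothetical.

```python
import queue


class BoundedLogBuffer:
    """Hypothetical fixed-size buffer that drops records instead of blocking."""

    def __init__(self, max_records):
        self._queue = queue.Queue(maxsize=max_records)
        self.dropped = 0  # count of records discarded because the buffer was full

    def offer(self, record):
        """Try to store a record without waiting; return False if it was dropped."""
        try:
            self._queue.put_nowait(record)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, limit):
        """Remove and return up to `limit` records for a separate log writer."""
        records = []
        while len(records) < limit:
            try:
                records.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return records


def handle_register(device_id, log_buffer):
    """Answer a registration without ever waiting on the logging path."""
    log_buffer.offer({"device": device_id, "event": "register"})
    return "200 OK"


# Usage: the buffer holds two records, so the third is dropped,
# but every device still receives its response.
buffer = BoundedLogBuffer(max_records=2)
responses = [handle_register(f"phone-{n}", buffer) for n in range(3)]
```

The trade-off in this design is that some log records are lost under load, in exchange for registrations that keep responding on time.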