Intermittent calling issues with Clearly Cloud USA

Incident Report for Clearly IP

Postmortem

At approximately 6:18 PM ET, an internal alarm detected a component restart within the Clearly Cloud US platform. The engineering team quickly determined that the component had successfully failed over to a redundant system while restarting, but this resulted in a few moments of disruption to dialing and status notifications (BLF, MWI, etc.). The issue was immediately escalated to key members of the infrastructure and software groups, who were conferenced in to analyze the situation. Approximately ten minutes later, another restart (and failover) occurred, causing another brief disruption to new calls and status updates.

The team traced the restarts to data integrity problems in several contact records. The problem occurred while a routine system task, which runs every 10 minutes, was purging expired contact records; research confirmed that this issue had never been encountered before. Several solutions were explored, and ultimately a repair of the affected data table was performed at approximately 7:40 PM ET, resolving the underlying problem. Additional cleanup tasks were implemented at approximately 8:00 PM ET as the escalation team monitored the platform. After recording several task cycles without further incident, the team reviewed potential improvements and documented the incident before marking it resolved at approximately 8:15 PM ET.
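
For illustration only, the failure mode described above (a periodic purge task encountering records with data integrity problems) can be sketched as follows. This report does not describe the platform's actual implementation, so everything here is an assumption: the purge_expired function, the "uri" and "expires" field names, and the idea of setting bad records aside for later repair are all hypothetical. The sketch only shows one general way a purge pass can skip unreadable records instead of failing on them.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(contacts, now=None):
    """Split contact records into (kept, quarantined).

    Hypothetical sketch: records whose "expires" field is missing or is
    not a timezone-aware datetime are set aside for repair rather than
    allowed to raise an error in the middle of the purge pass.
    """
    if now is None:
        now = datetime.now(timezone.utc)

    kept, quarantined = [], []
    for record in contacts:
        expires = record.get("expires") if isinstance(record, dict) else None

        # Treat anything we cannot safely compare as a damaged record.
        if not isinstance(expires, datetime) or expires.tzinfo is None:
            quarantined.append(record)
            continue

        # Expired records are dropped; only unexpired ones are kept.
        if expires > now:
            kept.append(record)

    return kept, quarantined


# Example run with made-up records (field names are assumptions).
now = datetime(2025, 8, 13, 22, 18, tzinfo=timezone.utc)
contacts = [
    {"uri": "a", "expires": now + timedelta(minutes=5)},  # still valid
    {"uri": "b", "expires": now - timedelta(minutes=5)},  # expired
    {"uri": "c", "expires": "garbage"},                   # damaged field
    {"uri": "d"},                                         # missing field
]
kept, quarantined = purge_expired(contacts, now)
```

In this sketch, damaged records are neither purged nor kept in service; they are returned separately so that they can be inspected and repaired outside the regular task cycle.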

While every part of Clearly Cloud has redundancies and most of these fail over gracefully without impacting operations, the component in question requires a few moments for backup systems to take over. During this window, which lasts between a few seconds and a minute, established calls are unaffected, but new calls may fail to complete and status messages for BLF, MWI, parking, and other notifications may not be sent. Although the issue's impact was limited due to the time of day, multiple brief failovers in a short timeframe caused disruption, which was noticed by several users.

ClearlyIP's engineering team will incorporate what was learned from this incident into upcoming platform improvements.

Posted Aug 14, 2025 - 21:07 CDT

Resolved

This incident has been resolved.
Posted Aug 13, 2025 - 20:51 CDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Aug 13, 2025 - 19:51 CDT

Identified

The issue has been identified and a fix is being implemented.
Posted Aug 13, 2025 - 19:17 CDT

Update

The team is investigating an issue that is causing inbound and outbound calls to fail occasionally. BLF lights are also affected by this.
Posted Aug 13, 2025 - 18:49 CDT

Investigating

We are currently investigating this issue.
Posted Aug 13, 2025 - 17:20 CDT
This incident affected: CloudPBX (Clearly Cloud USA).