On 9/16 at approximately 1:50 PM CDT, ClearlyIP began to receive indications of connectivity problems affecting the Pulsar360 Denver data center. Infrastructure and network teams were immediately notified and began troubleshooting. The team quickly ruled out internal infrastructure issues, confirming that links to all Pulsar360 equipment were working normally. Within 15 minutes, the team had identified the source of the issue as a regional Internet routing problem caused by actions of an unrelated third-party fiber provider.
The issue originated in a BGP route advertisement that erroneously covered large blocks of IP addresses, among them addresses used by Pulsar360's Denver location. That bad route information caused many ISPs to direct traffic away from the Denver data center.
ClearlyIP's engineering team worked in parallel to resolve the underlying issue and to explore workarounds that could be executed more rapidly. While contact was still being made with the third-party company through upstream contacts, a workaround was tested successfully: forcing advertisement of smaller network segments to counteract the large mis-advertisement. The approach was approved quickly, and over the next 30 minutes the team applied it to all of the Pulsar360 Denver network segments, restoring connectivity.
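The workaround relies on the fact that routers forward traffic along the most specific (longest) matching prefix, so advertising smaller segments pulls traffic back even while a broader bad advertisement is still in circulation. The following is a minimal sketch of that selection rule only, not a record of the actual configuration used. The prefixes (taken from the 198.51.100.0/24 documentation range), the labels "third-party" and "denver", and the helper names best_route and more_specifics are all hypothetical.

```python
import ipaddress


def best_route(table, address):
    """Return the (prefix, origin) entry with the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    matches = [(net, origin) for net, origin in table if ip in net]
    if not matches:
        return None
    # Longest-prefix match: the entry with the largest prefix length wins.
    return max(matches, key=lambda entry: entry[0].prefixlen)


def more_specifics(prefix):
    """Split a prefix into its two halves, each one bit more specific."""
    return list(ipaddress.ip_network(prefix).subnets(prefixlen_diff=1))


# Hypothetical view from one ISP that has accepted the bad covering route.
table = [(ipaddress.ip_network("198.51.100.0/24"), "third-party")]
before = best_route(table, "198.51.100.10")

# Workaround: the legitimate origin advertises the two more-specific halves.
table += [(net, "denver") for net in more_specifics("198.51.100.0/24")]
after = best_route(table, "198.51.100.10")

print("before:", before[0], "via", before[1])
print("after: ", after[0], "via", after[1])
```

In this simplified model the bad covering route is never withdrawn; the more-specific entries simply take precedence for every address they cover, which is consistent with connectivity returning before the third party had corrected its advertisement.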
Stable connectivity was restored across the entire Denver network approximately one hour after the incident was first reported.