From March 16th at 10:04 CDT through March 17th at 09:43 CDT, some users without an active session experienced intermittent authentication issues in which a new attempt to log in to their desktop or mobile client would hang. Additionally, some operations in the PBX settings of the Secure Portal failed with an unexpected error. Although these errors were intermittent, they occurred more consistently at the beginning of the day, when new login activity was higher. Voice services and call routing were not impacted during this incident, and users whose client held a persistent session from a previous login would not have noticed any issues.
The root cause of this incident was a defect in an older version of Cytracom Desktop, which incorrectly handled the server response when an expired authentication token was presented to our backend services. As a result, network traffic from clients that had not yet updated was significantly elevated and eventually overwhelmed the services that support authentication.
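To illustrate the class of defect described above, the following is a minimal sketch of how a client might handle an expired token without generating repeated traffic. It is not the Cytracom Desktop implementation. The `send` and `refresh` callables, the function name, and the use of HTTP status 401 to signal an expired token are all assumptions made for illustration.

```python
def call_with_refresh(send, refresh, token):
    """Send a request, refreshing an expired token at most once.

    Hypothetical helper for illustration only:
      send(token) -> int   performs the request, returns a status code
      refresh()   -> str   obtains a new authentication token
    A 401 status is assumed to mean "token expired".
    """
    status = send(token)
    if status != 401:
        # Token was accepted (or the failure is unrelated to expiry).
        return status, token

    # Token was rejected: refresh once and retry once.
    # The result of the retry is returned as-is, so a second 401
    # surfaces to the caller instead of triggering another attempt.
    token = refresh()
    return send(token), token
```

The key design choice in this sketch is that the retry is bounded: a client that keeps re-sending a rejected token, or retries without limit, can multiply its request volume in the way the root cause describes.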
On March 17th at 09:43 CDT, impacted authentication services were recovered, rate limiting was implemented to reduce the impact of clients that had not yet updated to the version where the defect was fixed, and no further unexpected login failures were observed.
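The report does not say which rate-limiting algorithm was used. As one common approach, a token bucket can be sketched as follows; the class name, parameters, and the per-client usage are assumptions for illustration, not a description of the deployed configuration.

```python
import time


class TokenBucket:
    """Token-bucket limiter: allows bursts up to `capacity` requests,
    then a sustained `rate` requests per second. Illustrative only."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.clock = clock          # injectable clock, useful for testing
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True if the request may proceed, False if it is limited."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a setup like this, a service would typically keep one bucket per client or per token and reject requests for which `allow()` returns False, so that a small number of misbehaving clients cannot consume the capacity needed by everyone else.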
Beyond the immediate resolution, we have added infrastructure alerting around our authentication services and will be reviewing our monitoring so that issues of a similar nature are detected in the future.