CS - EU: Investigating Elevated Error Rates
Incident Report for Gainsight
Postmortem

Incident:
A small number of customers experienced degraded performance in CS-EU Rules on January 3, 2024. The incident may also have intermittently affected the ability to log into the application.

Root Cause:
This incident was the result of an elevated number of API requests coming from a single microservice. The unexpected increase led to a build-up of connections, which degraded performance on a subset of API servers. Rate limiting was not configured as expected for this microservice, so the excess requests were not throttled.

Recovery Action:
Once the affected systems and related traffic were identified, isolating and restarting the affected API services resolved the issue immediately.

Preventive Measures:
We have corrected the rate limiter functionality for the microservice that caused this issue.
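
For readers unfamiliar with rate limiting, the following is a minimal sketch of one common approach, a token bucket, which caps how many requests a single caller can send in a burst and over time. It is an illustration only and does not represent Gainsight's actual implementation; the class name TokenBucket and the parameters rate, capacity, and clock are hypothetical.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter (illustrative, not Gainsight's code)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable time source, eases testing
        self.tokens = capacity      # start with a full bucket
        self.last = clock()         # time of the last refill

    def allow(self):
        """Return True if one request may proceed, False if it is throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a setup like this, each calling microservice would get its own bucket, and a request for which allow() returns False would be rejected (for example with HTTP 429) rather than being allowed to hold a connection open on the API servers.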

Posted Feb 23, 2024 - 05:44 UTC

Resolved
This incident has been resolved.
Posted Jan 03, 2024 - 15:14 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jan 03, 2024 - 14:50 UTC
Investigating
We are investigating a sudden increase in error rates which may lead to degraded performance or service interruption.
Posted Jan 03, 2024 - 14:34 UTC
This incident affected: Gainsight CS - EU Region (Gainsight CS EU Application).