An anomaly caused a service failure on a portion of the underlying API nodes. The affected nodes continued to report themselves as available while failing to accept traffic. This rare situation shifted additional load onto the nodes that were still accepting traffic, and auto-scaling did not trigger as designed to keep up with that load.
Engineers quickly identified and corrected the issue while ensuring that resources were available to meet demand.
We have adjusted our service monitors and auto-scaling configurations to prevent this scenario from recurring. We take availability and performance seriously and will continue to make improvements to provide the best possible experience.
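
For readers curious about the failure mode, the sketch below illustrates the general idea of a health check that requires a node to actually serve a request, and a scaling decision based on the number of serving nodes instead of the number that report availability. It is a minimal illustration under stated assumptions and not a description of our actual monitors or auto-scaling configuration: the names `Node`, `is_healthy`, `needs_scale_out`, the `probe` callable, and the 0.7 utilization target are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    name: str
    reports_available: bool      # what the node says about itself
    probe: Callable[[], bool]    # hypothetical end-to-end request; True if it was served


def is_healthy(node: Node) -> bool:
    """Count a node as healthy only if it reports available AND serves a probe."""
    if not node.reports_available:
        return False
    try:
        return bool(node.probe())
    except Exception:
        # A probe that errors out means the node is not accepting traffic.
        return False


def needs_scale_out(
    nodes: List[Node],
    total_load: float,
    capacity_per_node: float,
    target_utilization: float = 0.7,  # assumed threshold, for illustration only
) -> bool:
    """Decide on scaling using the nodes that actually serve traffic."""
    healthy = [n for n in nodes if is_healthy(n)]
    if not healthy:
        return True
    utilization = total_load / (len(healthy) * capacity_per_node)
    return utilization > target_utilization
```

With four nodes that all report availability but only two that serve the probe, a load that looks comfortable against four nodes exceeds the target against two, so `needs_scale_out` returns `True` where a count based on self-reported availability would not.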