OpenAI

Write-up
Increased login and signup errors

Incident Summary

On August 28, 2025, from approximately 6:55 AM to 7:59 AM PDT, OpenAI experienced elevated error rates across several authentication endpoints. This resulted in a significant number of failed login and signup attempts, as well as disruptions for existing users attempting to refresh their session tokens. Engineers restored service by shifting traffic and extending token lifetimes. We are implementing several reliability improvements to reduce the impact of similar issues and to prevent them from recurring.

Root Cause

The issue was caused by degraded performance in a database cluster that supports our authentication systems. This led to increased latency and failures for critical token exchange operations, including both initial logins and token refreshes.

Impact

  • Users attempting to log in or sign up during this period experienced a failure rate of approximately 69%.

  • A small fraction of already logged-in users may have been logged out due to token refresh failures.

Remediation

To restore service, our engineering teams implemented multiple mitigations:

  • Traffic was shifted away from the affected database cluster.

  • A failover to a secondary region was initiated.

  • We extended the time-to-live (TTL) for access tokens to reduce the frequency of refresh attempts.

These actions brought the system back to a healthy state by 7:59 AM PDT on August 28.
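
To illustrate why extending the access token TTL reduces load on the token exchange path, here is a minimal back-of-the-envelope sketch. It assumes each active session refreshes once per token lifetime and that refreshes are spread evenly over time. The session count and TTL values are hypothetical and are not figures from this incident.

```python
def refresh_rate_per_second(active_sessions: int, ttl_seconds: float) -> float:
    """Approximate steady-state refresh requests per second.

    Assumption: every active session refreshes exactly once per token
    lifetime, and refreshes are evenly distributed over time.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return active_sessions / ttl_seconds


# Hypothetical numbers, for illustration only.
sessions = 1_000_000
before = refresh_rate_per_second(sessions, ttl_seconds=600)    # 10-minute TTL
after = refresh_rate_per_second(sessions, ttl_seconds=3600)    # 60-minute TTL

print(f"before: {before:.1f} req/s, after: {after:.1f} req/s")
print(f"reduction factor: {before / after:.1f}x")
```

Under these assumptions the refresh rate scales inversely with the TTL, so a sixfold longer lifetime means roughly one sixth of the refresh traffic. The trade-off is that a longer-lived access token also takes longer to expire after it is revoked or leaked.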

Preventive Actions

To prevent similar disruptions in the future, we are:

  • Enhancing storage backend reliability from multiple angles.

  • Working to ensure that transient backend failures do not lead to user logouts if access tokens are still valid.

  • Investigating ways to reduce unnecessary token refreshes via client-side improvements.
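
The last two items can be sketched as a client-side session policy: skip the refresh while the access token still has ample lifetime left, and treat a transient refresh failure as "retry later" rather than "log out" as long as the token is still valid. This is an illustrative sketch only, not OpenAI's implementation; `Session`, `TransientBackendError`, `maintain_session`, and the 60-second `refresh_margin` are all hypothetical names and values.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class TransientBackendError(Exception):
    """Hypothetical error for timeouts or 5xx responses from the auth backend."""


class Action(Enum):
    KEEP = "keep"            # keep using the current access token
    REFRESHED = "refreshed"  # token was exchanged successfully
    LOGOUT = "logout"        # token expired and could not be renewed


@dataclass
class Session:
    access_token: str
    expires_at: float  # Unix timestamp in seconds


def maintain_session(
    session: Session,
    refresh: Callable[[Session], Session],
    now: Optional[float] = None,
    refresh_margin: float = 60.0,
) -> Action:
    """Decide what to do with a session (a sketch under stated assumptions)."""
    now = time.time() if now is None else now
    remaining = session.expires_at - now

    # Avoid unnecessary refreshes: the token is not close to expiry.
    if remaining > refresh_margin:
        return Action.KEEP

    try:
        renewed = refresh(session)
    except TransientBackendError:
        # A transient failure alone should not log the user out while
        # the current token is still valid; the caller can retry later.
        return Action.KEEP if remaining > 0 else Action.LOGOUT

    session.access_token = renewed.access_token
    session.expires_at = renewed.expires_at
    return Action.REFRESHED


def failing_refresh(_: Session) -> Session:
    raise TransientBackendError("auth backend unavailable")


# Token has 30 s left and the backend is failing: the session is kept.
print(maintain_session(Session("tok", expires_at=1030.0), failing_refresh, now=1000.0))
```

In practice the caller would retry the refresh with backoff while the result is `Action.KEEP` and the token is inside the refresh margin, so that a short backend degradation is absorbed without forcing a new login.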

We apologize for the disruption and appreciate your patience while we work to strengthen the reliability of our authentication systems.
