Elevated error rates across all services
Incident Report for OpenAI
Postmortem

On Wednesday May 12 2023, from approximately 11:45am to 12:10pm GPT-3.5, GPT-3.5 Turbo and GPT-4 models were unavailable due to an incorrect deployment to our safety classifier configuration. Most customers started experiencing errors at 11:55am. After we detected an elevated error rate, we quickly rolled back the configuration change which restored service.

We have since fixed our tooling to catch these errors before they hit production. Also, we have added increased alerting to detect these errors more quickly to enable us to roll back faster. Lastly, we have a project already in progress to incrementally deploy changes, so that we can detect and revert errors while only running on a small percentage of traffic. That project will be operational within the quarter.

Posted May 11, 2023 - 17:03 PDT

Resolved
This incident has been resolved.
Posted May 10, 2023 - 12:14 PDT
Monitoring
The issue has been identified and we are currently investigating.
Posted May 10, 2023 - 12:12 PDT
Investigating
We are currently investigating an issue affecting multiple services, including chat.openai.com, labs.openai.com, and API services.
Posted May 10, 2023 - 11:59 PDT
This incident affected: API, ChatGPT, and Labs.