Resolved
Performance has been stable since the previous update. We are marking this incident as resolved.
Monitoring
Performance appears to have mostly recovered at this point. We are continuing to monitor the situation.
Identified
Our deployment is progressing and models are continuing to recover. We are no longer seeing errors as frequently, and latencies are dropping across the board. Performance will remain degraded until the deployment completes.
Identified
We are rolling out the fix and our models are in the process of recovering.
Investigating
The fix we identified seems promising. Latency has been restored to the model we tested it on (text-babbage-001). We are now rolling it out more broadly.
Investigating
We are continuing to investigate this issue. We also have observed that other models are experiencing increased latencies as well, though not to the point of failures.
We have a candidate fix that we are trying out now.
Investigating
We began experiencing intermittent failures in text-davinci-002 due to load beginning at 1:25 pm.
We are actively rearranging our capacity to allow these engines to recover.