Elevated error rate while creating fine-tuning jobs
Incident Report for OpenAI
Resolved
Fine-tuning service is running normally.
Posted Jul 31, 2024 - 14:16 PDT
Update
Error rates have stabilized and Fine-tuning API endpoints are operating normally again. Jobs are being processed as expected. We will continue to monitor this service closely for elevated error rates.
Posted Jul 31, 2024 - 12:45 PDT
Update
We are experiencing a temporary increase in error rates from the /v1/fine_tuning API, including job creation, job listing, and event listing. We expect these errors to subside by 12:45pm PT.
Posted Jul 31, 2024 - 12:35 PDT
Monitoring
Fine-tuning jobs are being processed again, though the service is still experiencing elevated error rates.
Posted Jul 31, 2024 - 11:19 PDT
Identified
The Fine-tuning API is currently online and accepting job creation requests, but there is a delay in job processing. Jobs will remain queued for the time being.
Posted Jul 31, 2024 - 10:57 PDT
Investigating
Another spike in errors just occurred, preventing jobs from being created. A mitigation is being pushed.
Posted Jul 31, 2024 - 10:49 PDT
Monitoring
A fix has been deployed and the service is operating normally. We are continuing to monitor this service for errors.
Posted Jul 31, 2024 - 09:12 PDT
Investigating
We are experiencing elevated error rates (500 responses) on the POST /v1/fine_tuning/jobs endpoint.
Posted Jul 31, 2024 - 09:09 PDT
This incident affected: API.