OpenAI

Outage on some Ada and Babbage models


Resolved

Summary:

On July 6, 2022 22:17 PDT (July 7 05:17 UTC), the following models had partial or complete outages:

  • text-similarity-ada-001

  • text-search-babbage-doc-001

  • content-filter-alpha

  • text-search-babbage-query-001

  • text-search-ada-doc-001

Models were unavailable for one and a half hours. The outage was caused by an unanticipated maintenance event performed by Azure, which took offline the VMs on which these models were served. We restored service by reconfiguring the models to depend on VMs that were not taken offline by the maintenance event.

Root Cause:

A misconfiguration in our Azure activity log alerts meant that we received no prior notice of this scheduled maintenance.

Future Mitigations:

Evaluate the activity log notification configurations for our critical resources to ensure notifications are received in a timely manner.

Mon, Jul 11, 2022, 11:32 PM

Resolved

All models have recovered and are operational. Thank you for your patience.

Thu, Jul 7, 2022, 08:12 AM (4 days earlier)

Monitoring

We’ve moved affected models to other regions and are seeing broad recovery. We are continuing to monitor while we investigate the root cause.

Thu, Jul 7, 2022, 06:56 AM (1 hour earlier)

Identified

We have lost the ability to communicate with virtual machines in one of the regions in which we operate. We are mitigating by moving capacity to other regions. We are starting to see recovery on some affected models.

Thu, Jul 7, 2022, 06:44 AM (12 minutes earlier)

Identified

We have identified an issue affecting some Ada and Babbage models and are working on a remediation.

Thu, Jul 7, 2022, 06:07 AM (37 minutes earlier)
