Write-up published
Resolved
On July 6, 2022 22:17 PDT \(July 7 05:17 UTC\), the following models had partial or complete outages:
text-similarity-ada-001
text-search-babbage-doc-001
content-filter-alpha
text-search-babbage-query-001
text-search-ada-doc-001
Models were unavailable for one and a half hours. This was caused by an unanticipated maintenance event performed by Azure which took offline VMs that these models were served. Models were reconfigured to depend on VMs that were not taken offline by the maintenance event.
A misconfiguration in our Azure activity log alerts caused this scheduled maintenance to occur without prior notice on our end.
Evaluate the activity log notification configurations for our critical resources to ensure notifications are received in a timely manner.
Resolved
All models have recovered and are operational. Thank you for your patience.
Monitoring
We’ve moved affected models to other regions, and are seeing broad recovery. We are continuing to monitor while we investigate root cause.
Identified
We have lost the ability to communicate with virtual machines in one of the regions in which we operate. We are mitigating by moving capacity to other regions. We are starting to see recovery on some affected models.
Identified
We have identified an issue affecting some Ada and Babbage models and are working on a remediation.