Service Incident, August 15th 2016: Access issues on Pod 5

17:10 UTC | 10:10 PT
The access issues are now resolved and Zendesk services are restored.

16:24 UTC | 09:24 PT
Performance continues to improve, but our investigation is ongoing.

POST-MORTEM

Our root cause investigation determined that the translation micro-service hosts in Pod 5 were intended to run with 4 resource workers, but each host had only 2 workers configured. This led to severe resource starvation at the peak of weekday traffic and caused a backlog of requests to build up. The issue resolved when the resource-starved hosts recovered on their own as request traffic trended down about 30 minutes later. To prevent this issue from recurring, we have increased resource memory for our translation micro-service hosts in Pod 5.
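
As a rough illustration of the behaviour described above, the following sketch models how a request backlog can grow when a host runs 2 workers instead of 4, and how it drains by itself once traffic trends down. It is a toy model only: the traffic figures, the per-worker throughput, and the function name simulate_backlog are all hypothetical and are not taken from the incident data.

```python
def simulate_backlog(arrivals_per_min, workers, per_worker_rate):
    """Return the number of queued requests at the end of each minute.

    All inputs are hypothetical; this is a simple fluid model, not a
    reproduction of the actual service.
    """
    capacity = workers * per_worker_rate  # requests handled per minute
    backlog = 0
    history = []
    for arrivals in arrivals_per_min:
        # Unserved requests carry over; the queue can never go negative.
        backlog = max(0, backlog + arrivals - capacity)
        history.append(backlog)
    return history


# Hypothetical traffic: ramps up to a peak, then trends down.
traffic = [100, 150, 200, 200, 150, 100, 50, 50, 50, 50, 50, 50]

# Assumed throughput of 50 requests per minute per worker.
with_two = simulate_backlog(traffic, workers=2, per_worker_rate=50)
with_four = simulate_backlog(traffic, workers=4, per_worker_rate=50)

print("2 workers:", with_two)
print("4 workers:", with_four)
```

In this toy model the 2-worker host accumulates a backlog during the peak and only clears it after traffic has stayed low for several minutes, while the 4-worker host never queues at all.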

FOR MORE INFORMATION

For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.