Service Incident July 24th 2016 Help Center search errors pod 4

18:38 UTC | 11:38 PDT  

Search issues have been resolved. Once a post-mortem is available, it will be posted below.

18:06 UTC | 11:06 PDT  

Services have been restored. New search results will be available soon.

16:51 UTC | 09:51 PDT  

Services continue to recover. We'll tweet again once recovery is complete.

15:54 UTC | 08:54 PDT  

To speed up recovery, affected customers in Pod 4 won’t see recent items until search fully recovers.

14:54 UTC | 07:54 PDT  

Operations continues to investigate the issue. Please check http://status.zendesk.com for updates.

13:35 UTC | 06:35 PDT  

12:27 UTC | 05:27 PDT  

The operations team is working to mitigate the error and resolve it as soon as possible. Next update in one hour.

11:13 UTC | 04:13 PDT  

Performance has improved in Pod 4, but we still see minor errors. Our team is working on fixing the services. Next update in 1 hour.

09:57 UTC | 02:57 PDT  

We are still seeing some minor issues related to the Search service and Help Center. Our team continues to work to fully repair the services.

09:16 UTC | 02:16 PDT  
We are seeing a reduction in errors at this time. Our operations team continues to investigate. http://status.zendesk.com

08:37 UTC | 01:37 PDT  
Search functionality has improved for Help Center customers in Pod 4. Our operations team continues to investigate. http://status.zendesk.com

08:17 UTC | 01:17 PDT
Pod 4 customers have reported Help Center unavailability. We are investigating and will provide an update shortly. http://status.zendesk.com

POST-MORTEM

During this incident, customers experienced errors when using the search feature in Help Center. The incident began during a new deploy of the Help Center search service, which resulted in an extremely high load on the client nodes. The changes were rolled back, additional client nodes were added, and the changes were then re-deployed without incident. While we were not able to determine which deployed file was at issue, adding more capacity allowed for a successful deployment. To minimize the risk of recurrence, we will increase the size of our test environments and move to a staged deploy process for our search service.

FOR MORE INFORMATION

For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.