Service Incident February 25th, 2016

09:09 UTC | 01:09 PT
We are happy to report that the search issues affecting some customers are now resolved.

09:01 UTC | 01:01 PT
We are beginning to see improvements with search in our US West Coast data center.

POST-MORTEM

During a planned upgrade of Elasticsearch, one of the servers in our US West Coast data center became unresponsive. After the upgrade finished, search returned to normal. We had used the same upgrade plan on other servers without issue. However, this data center is our largest, with some of its indices reaching 500GB in size, so the upgrade process was very resource-heavy and degraded search performance across the entire server. We regrouped and modified the upgrade plan to address this issue, and later completed the upgrade without any service impact.

Furthermore, the version of Elasticsearch that we were upgrading from (1.6) depends on the availability of all the servers in a cluster to determine cluster health. If one server is not responding, Elasticsearch reports the whole cluster as unhealthy, which makes the service unavailable when queried. Upgrading to the later version will help minimize this behavior. Unfortunately, the behavior persists until all servers in the cluster have been upgraded. We are continuing to upgrade our search service to fully stabilize its performance.
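
To illustrate the difference between a strict and a tolerant health check, here is a minimal Python sketch. It is not our production code. It assumes a JSON document shaped like the response of the Elasticsearch _cluster/health endpoint, which includes the status and number_of_nodes fields. The function names, the expected node count, and the sample cluster name are hypothetical and chosen only for this example.

```python
import json

# Hypothetical policy: cluster statuses under which search is still served.
SERVABLE_STATUSES = {"green", "yellow"}


def strict_can_serve(health_json: str, expected_nodes: int) -> bool:
    """Strict check: every expected server must be present and status green.

    This mirrors the behavior described above, where one unresponsive
    server is enough to treat the whole cluster as unavailable.
    """
    health = json.loads(health_json)
    return (
        health.get("status") == "green"
        and health.get("number_of_nodes") == expected_nodes
    )


def tolerant_can_serve(health_json: str) -> bool:
    """Tolerant check: keep serving while the status is green or yellow."""
    health = json.loads(health_json)
    return health.get("status") in SERVABLE_STATUSES


# Sample health document: one of five servers is missing (values invented).
sample = json.dumps({
    "cluster_name": "search-example",
    "status": "yellow",
    "number_of_nodes": 4,
    "unassigned_shards": 12,
})

print(strict_can_serve(sample, expected_nodes=5))
print(tolerant_can_serve(sample))
```

With the sample document, the strict check refuses to serve because one server is missing, while the tolerant check keeps search available as long as the cluster does not report red.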

FOR MORE INFORMATION

Please subscribe to this article for regular updates until the issue is resolved. If you aren't subscribed to our Twitter feed, we encourage you to do so in order to get the most current information about any service issues. We also record all site outages on our system status page where you can see the past 12 months of service uptime. If you have questions about this issue, please open a ticket with us by sending a note to support@zendesk.com.