Service Disruption September 10th 2015

11:00 GMT+1 / 03:00 PST

We're currently investigating reports of UI problems for some customers, which may include elements not loading or being unresponsive. More to follow.

11:18 GMT+1 / 03:18 PST

Our Operations team is still looking into intermittent issues with Zendesk for some customers. Apologies for any inconvenience.

11:49 GMT+1 / 03:49 PST

Today's problems with Zendesk UI and certain functions have now been resolved. Post-mortem to follow as soon as available.

13:59 GMT+1 / 05:59 PST

We are investigating new reports regarding intermittent UI problems causing performance issues for some customers. More info to follow.

14:37 GMT+1 / 06:37 PST

We are continuing to investigate reports regarding intermittent UI problems causing performance issues for some customers.

15:08 GMT+1 / 07:08 PST

We have determined that CDN issues are causing performance issues for some customers and are continuing to work on a resolution.

15:45 GMT+1 / 07:45 PST

We are continuing to work on the CDN issues causing performance problems for some customers. Update to follow shortly.

16:33 GMT+1 / 08:33 PST

Our Operations team is continuing to work through issues impacting performance for some customers.

17:30 GMT+1 / 09:30 PST

We are continuing to work on the CDN issues which are impacting some of our customers.

18:44 GMT+1 / 10:44 PST

We are further isolating the cause of performance and UI issues for some accounts. Additional updates each hour as remediation progresses.

20:04 GMT+1 / 12:04 PST

A fix was implemented for recent UI/performance issues. If impacted, please clear browser cache and let us know if the issue persists.

00:54 GMT+1 / 17:54 PST

The UI issues have now been resolved. Post-mortem to follow.

POST-MORTEM

This incident was caused by a bug in the code of a Zendesk Apps Framework deploy on September 9th, 2015. The deploy contained a patch that allowed the framework to cancel an app's timers after the app had been destroyed, preventing memory leaks of the kind that have caused incidents in the past. Unfortunately, the bug caused timers outside of apps to be cancelled in unexpected scenarios. This led to a range of UI glitches when certain conditions were met, most noticeably on accounts that had the Time Tracking or Jira app installed, when a ticket was submitted.
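To illustrate the kind of cleanup the patch was aiming for, here is a minimal TypeScript sketch of timer tracking scoped to each app. It is not the Zendesk Apps Framework code: the class ScopedTimers and its methods schedule, destroyApp and pendingCount are hypothetical names chosen for this example. The idea is that the framework records which app created each timer, so destroying an app cancels only that app's timers and leaves every other timer alone.

```typescript
// Hypothetical sketch: these names are illustrative and are NOT the
// Zendesk Apps Framework API.
type TimerId = ReturnType<typeof setTimeout>;

class ScopedTimers {
  // Maps each app id to the set of timer handles that app created.
  private readonly timersByApp = new Map<string, Set<TimerId>>();

  // Schedule a callback on behalf of one app and record who owns the handle.
  schedule(appId: string, callback: () => void, delayMs: number): TimerId {
    let owned = this.timersByApp.get(appId);
    if (owned === undefined) {
      owned = new Set<TimerId>();
      this.timersByApp.set(appId, owned);
    }
    const tracked = owned;
    const id: TimerId = setTimeout(() => {
      tracked.delete(id); // the timer has fired, so stop tracking it
      callback();
    }, delayMs);
    tracked.add(id);
    return id;
  }

  // Cancel only the timers registered by this app.
  // Returns the number of timers that were cancelled.
  destroyApp(appId: string): number {
    const owned = this.timersByApp.get(appId);
    if (owned === undefined) {
      return 0;
    }
    owned.forEach((id) => clearTimeout(id));
    this.timersByApp.delete(appId);
    return owned.size;
  }

  // Number of timers still pending for an app.
  pendingCount(appId: string): number {
    const owned = this.timersByApp.get(appId);
    return owned === undefined ? 0 : owned.size;
  }
}
```

With this shape, a timer that was never registered through schedule cannot be reached by destroyApp at all, which is the property the buggy deploy failed to preserve.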

Symptoms included an intermittent white or blank screen when switching tabs in the Zendesk Agent UI, tickets not resolving or not assigning to others, comments not sticking, objects disarranged in the Agent UI, and other undesired effects appearing on and off. A rollback of the Zendesk Apps Framework appeared to resolve the problem, and a final test 24 hours after the incident (a roll-forward followed by a patch) confirmed that the deploy contained the bug causing the trouble.

The bug was not detected in testing because the particular conditions required to trigger it would have been difficult to isolate through automated tests. We could have caught this during QA; however, our test plans at the time did not include the scenarios that triggered the issue. Covering these scenarios was added as a remediation item following this incident and has already been incorporated into our QA flow.

 
