Service Incident July 18th 2016 500 errors on api v2 gooddata user json

09:15 UTC | 02:15 PT
The Insights availability issues affecting some of our POD3 customers have now been resolved.

08:42 UTC | 01:42 PT
We are seeing a reduction in connection errors at this time. We continue to work with our analytics partner for full resolution.


This post-mortem summary applies to incidents occurring on July 16 and 18, 2016.

On July 17th, 2016, after a planned GoodData platform release, we began to see intermittent network connection failures between Zendesk and GoodData, as well as an unusual reduction in network throughput. GoodData reports that during the incident it tried several remedial actions, including reverting some deployed components and repeatedly reconfiguring and rebalancing the network gateway resources. Despite these actions, the GoodData team was unable to identify the cause of the connection failures. Instead, the connection and performance issues subsided for no clear reason. When the release was re-deployed with the same configuration, no network issues were found, even after significant additional testing and monitoring. GoodData has not ruled out the possibility that the cause was external to its release.

Absent solid evidence of a cause, we have identified improvements to our staging environment so that it more closely matches production, increasing the likelihood that issues will be found before a production release. The Zendesk and GoodData teams are also working to tighten notification and escalation processes to improve communication and reduce time to resolution.


For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.