Call it bad luck, or Murphy's Law, but this leap year will not be a great memory for those responsible for managing the Atlassian Cloud platform. According to Atlassian's support post (found here), there was a major cloud incident that, at the time of publishing this article, has not been fully resolved.
For affected users (and the scope so far seems very wide), the simplest way to explain the incident is that the automation rules inside their Jira sites disappeared.
The image below is from the Atlassian Service Status Page as of February 29, 18:00 PST. Jira admins should watch this page and subscribe for updates to stay fully aware of Atlassian's resolution progress and plans: https://jira-service-management.status.atlassian.com/
Automation Rules allow you to automate actions within your system based on criteria that you set. Automation rules are made up of three parts: triggers that kick off the rule, conditions that refine the rule, and actions that perform tasks in your site.
https://support.atlassian.com/cloud-automation/docs/create-and-edit-jira-automation-rules/
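To make the trigger/condition/action structure concrete, here is a minimal Python sketch of how such a rule could be modelled. It is an illustration only and not how Jira implements automation: the Rule class, the event names and the dictionary-based issue are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

Issue = dict  # hypothetical: an issue represented as a plain dictionary


@dataclass
class Rule:
    name: str
    trigger: str  # name of the event that kicks off the rule
    conditions: list[Callable[[Issue], bool]] = field(default_factory=list)
    actions: list[Callable[[Issue], None]] = field(default_factory=list)

    def handle(self, event: str, issue: Issue) -> bool:
        """Run the actions if the event matches the trigger and all conditions pass."""
        if event != self.trigger:
            return False
        if not all(condition(issue) for condition in self.conditions):
            return False
        for action in self.actions:
            action(issue)
        return True


def set_high_priority(issue: Issue) -> None:
    issue["priority"] = "High"


# Hypothetical rule: when an issue is created, and it is a bug, raise its priority.
escalate = Rule(
    name="Escalate new bugs",
    trigger="issue_created",
    conditions=[lambda issue: issue.get("type") == "Bug"],
    actions=[set_high_priority],
)

bug = {"type": "Bug", "priority": "Medium"}
task = {"type": "Task", "priority": "Medium"}
escalate.handle("issue_created", bug)   # condition passes, action runs
escalate.handle("issue_created", task)  # condition fails, nothing changes
```

Seen this way, losing a rule means losing all three parts at once, which is why a missing rule silently stops a business process rather than raising an error.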
Because Atlassian customers use automation rules creatively for just about anything, their loss can have a catastrophic impact on customers who rely on them for core business processes in their Jira/JSM sites.
Examples of this are:
As experts in system and data recovery for Jira Cloud, we have been inundated with requests from a variety of Jira sites to help them build a recovery plan and get their automations recovered and working again.
On March 29th, we invited the expert crew from The Jira Life, Rodney Nissen, Alex Ortiz and Robert Wen, to jump onto a livestream with us to help the Atlassian Community recover from this incident.
The recording is below. We also decided to start this blog page with a section (if you scroll down) of actionable steps that Jira admins can take to recover as safely as possible and to minimize further business impact.
Disclaimer: the information in this article is a set of guidelines and ideas and is not based on any inside knowledge of this evolving Atlassian Cloud incident. The advice below should therefore be taken in general terms. Each administrator should take extra care to follow Atlassian's recommendations, adhere to careful change management processes and, if unsure, defer to qualified solution partner experts and/or the Atlassian support team.
Talk to Your End Users
This incident has already lasted over one business day, so it is clearly significant. Some sites may be affected for a long time, so get in touch with your end users and management and:
If you can restore from backup, do it with care
When you restore or rebuild workflows - switch them on one at a time
Each workflow that you restore or rebuild should be turned on individually, and take care to look at dependencies and other factors when you turn a workflow back on. Test each one carefully before switching on another so that you can limit and minimize the impact of any unexpected behaviour.
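The one-at-a-time approach above can be sketched in Python as follows. This is a sketch under stated assumptions, not a Jira API: staged_enable, the enable and verify callbacks and the item names are all hypothetical, and in practice "enable" and "verify" would be manual steps performed by an admin. The sketch orders items so that anything another item depends on is switched on first, and it stops at the first failed check.

```python
from graphlib import TopologicalSorter
from typing import Callable


def staged_enable(
    depends_on: dict[str, set[str]],
    enable: Callable[[str], None],
    verify: Callable[[str], bool],
) -> list[str]:
    """Enable items one at a time, prerequisites first; stop at the first failed check.

    Returns the names of the items that were enabled and passed verification.
    """
    enabled: list[str] = []
    # static_order() yields each item only after everything it depends on.
    for name in TopologicalSorter(depends_on).static_order():
        enable(name)
        if not verify(name):
            # Halt so the problem is confined to the item just switched on.
            break
        enabled.append(name)
    return enabled


# Hypothetical names; each maps to the items that must already be switched on.
rules = {
    "Assign to team": set(),
    "Set SLA": {"Assign to team"},
    "Notify manager": {"Set SLA"},
}
switched_on: list[str] = []
# Pretend the check fails for "Set SLA" to show that the rollout stops there.
result = staged_enable(rules, switched_on.append, lambda name: name != "Set SLA")
```

In this example the rollout halts after "Set SLA" fails its check, so "Notify manager" is never switched on. The sketch does not switch the failing item back off; that decision is left to the admin.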
Make a backup of what you've got - even if it's not working
Even if it's not working, it is good practice to back up both your data and your configuration in Jira Cloud before rectifying automations, batch updating tickets or doing anything else to resolve the situation.
Emergency repair jobs sometimes don't go to plan, and it's great to have a definitive restore point to go back to if data is damaged.
Announcement for Affected Jira Admins
We have decided to provide free access to our backup and clone tool for Jira Cloud configuration, including an extended 60-day trial, for any affected Jira sites. This will enable you to safely and securely back up and protect your Jira automations immediately.
We will be updating this page with more resources, advice and instructions aimed at helping you, so bookmark it and come back to it in future.
In a somewhat serendipitous move, Vish Reddy and Stu Lees from Revyz recorded this demo video only two weeks ago.