This page describes what to do after a major incident: our follow-up and after-action review procedures.

Follow-up Actions for Response Roles

In addition to any direct follow-up items generated from an incident, each of our response roles will have a few standard follow-up tasks. These are generally lightweight actions that ensure we organize information and follow up appropriately.

Steps for Incident Commander

  1. Update the incident in incident.io.

    • Set the final severity of the incident.
    • Resolve the incident.
  2. Create the postmortem, and assign an owner to the postmortem for the incident.

  3. Send out an internal message in Slack to the relevant stakeholders explaining that we had an incident, and provide a link to the postmortem.

  4. Periodically check on the progress of the postmortem to ensure that it is completed within the desired time frame.
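As one way to make the progress check in the last step concrete, here is a minimal Python sketch of an "is this postmortem overdue?" test. The function name, its parameters, and the idea of measuring the time frame from the moment the incident was resolved are assumptions made for illustration; this page does not define the desired time frame, so it is passed in as a parameter.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def postmortem_is_overdue(
    resolved_at: datetime,
    time_frame: timedelta,
    completed: bool,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the postmortem is unfinished and past its deadline.

    Hypothetical helper: `time_frame` stands in for whatever "desired
    time frame" the team has agreed on, counted from `resolved_at`.
    """
    if completed:
        return False
    # Default to the current UTC time; callers can pass `now` explicitly.
    current = now if now is not None else datetime.now(timezone.utc)
    return current > resolved_at + time_frame
```

An Incident Commander could run a check like this against each open postmortem when deciding whether to nudge its owner.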

Steps for First Responder (Scribe)

  1. Review the Slack communications and extract any relevant items from key events.

  2. Collect all TODO items and add them to the postmortem.

Steps for Subject Matter Experts

  1. Add any notes you think are relevant to the postmortem.

Steps for Communication Commander

  1. Reply to any user or partner enquiries we received about the incident.

  2. Follow the postmortem progress, and update status pages and community channels with the external message once it is available.

Steps for Internal Liaison

There are no additional steps after an incident is resolved. However, the Incident Commander (IC) may ask for your help answering questions from internal stakeholders.
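The per-role steps above could be captured as a simple checklist, for example to paste into the postmortem. The following is a minimal Python sketch only: the task wording is paraphrased from this page, while the `FOLLOW_UP_TASKS` mapping and the `render_checklist` function are hypothetical names and not part of incident.io, Slack, or any tooling we use.

```python
# Standard follow-up tasks per response role, paraphrased from this page.
# The data layout is a hypothetical illustration.
FOLLOW_UP_TASKS: dict[str, list[str]] = {
    "Incident Commander": [
        "Set the final severity and resolve the incident in incident.io",
        "Create the postmortem and assign an owner",
        "Message relevant stakeholders in Slack with a link to the postmortem",
        "Check on postmortem progress",
    ],
    "First Responder (Scribe)": [
        "Review Slack communications and extract items from key events",
        "Collect all TODO items and add them to the postmortem",
    ],
    "Subject Matter Experts": [
        "Add any relevant notes to the postmortem",
    ],
    "Communication Commander": [
        "Reply to user or partner enquiries about the incident",
        "Update status pages and community channels with the external message",
    ],
    "Internal Liaison": [],  # no standard steps after resolution
}


def render_checklist(role: str) -> str:
    """Render one role's follow-up tasks as a plain-text checklist."""
    tasks = FOLLOW_UP_TASKS[role]
    if not tasks:
        return f"{role}: no standard follow-up tasks"
    lines = [f"{role}:"]
    lines += [f"  [ ] {task}" for task in tasks]
    return "\n".join(lines)
```

Calling `render_checklist("First Responder (Scribe)")` would produce a heading line followed by one unchecked box per task, which each responder can tick off as they go.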

Reviewing the Incident

It's important that we review the incident in detail to see exactly what went wrong, why it went wrong, and what we can do to make sure it doesn't happen again. Such reviews go by many names: after-action reviews, incident reviews, follow-up reviews, and so on. We use the term postmortem.

For more detail, see our postmortem process documentation.

Reviewing the Process

As well as reviewing the incident, it's important to review our process. Did we handle the incident well, or are there things we could have done better?

This review typically involves the Incident Commander and key responders discussing how we might have done things differently and whether there are any tweaks we can make to our incident response process.

If you're interested in joining these discussions, just let the IC know and they'll be sure to include you.