Incident Playbook
Date | Version | Changes |
---|---|---|
2022-04-15 | v1.0.0 | |
The incident playbook is a short set of clear instructions for starting incident management when an outage occurs.
1.1 You SHOULD run fire drills in which the team practices incident management, so that everyone knows the incident playbook exists, where to find it, and how to follow it, and can rehearse it in a safe environment.
Incident Management
2.1 The Incident Response Workflow SHOULD be used to determine whether the outage is an incident and, if so, what level of incident it is.
2.2 An incident manager MUST be assigned once an incident has been declared. The incident manager SHOULD NOT be someone who is actively working on the resolution.
2.3 The incident manager MUST assemble an incident response team.
2.4 The incident manager MUST inform the organization about the ongoing incident, and keep the organization informed during the incident.
2.5 The incident manager MUST make sure that a resolution is actively being worked on while the incident is ongoing.
2.6 The incident manager MUST record all events and communication during the incident for the postmortem.
2.7 Once the incident is resolved, or the outage is over, the incident manager MUST formally declare the incident closed and communicate this to the organization.
See Postmortem for how the organization should use the incident as an opportunity for learning.
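The duties in 2.2 through 2.7 can be pictured as a small incident record that the incident manager keeps up to date. The following is only a minimal sketch under stated assumptions: the `Incident` class, its field and method names, and the `notify` callback (a stand-in for whatever channel the organization actually uses) are all hypothetical and are not defined by this playbook.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Incident:
    """Hypothetical incident record; all names here are illustrative."""

    title: str
    manager: str  # 2.2: assigned once the incident is declared
    notify: Callable[[str], None] = print  # 2.4: stand-in for a real channel
    responders: list[str] = field(default_factory=list)  # 2.3
    log: list[tuple[datetime, str]] = field(default_factory=list)  # 2.6
    closed: bool = False

    def record(self, message: str) -> None:
        """Keep a timestamped entry for the postmortem (2.6)."""
        self.log.append((datetime.now(timezone.utc), message))

    def announce(self, message: str) -> None:
        """Inform the organization and record that it was informed (2.4)."""
        self.notify(f"[{self.title}] {message}")
        self.record(f"announced: {message}")

    def add_responder(self, name: str) -> None:
        """Add a member to the incident response team (2.3)."""
        if name == self.manager:
            # 2.2 is a SHOULD NOT, so this is noted rather than rejected.
            self.record(f"warning: manager {name} is also a responder")
        self.responders.append(name)
        self.record(f"responder added: {name}")

    def close(self) -> None:
        """Formally close the incident and tell the organization (2.7)."""
        if self.closed:
            raise RuntimeError("incident is already closed")
        self.closed = True
        self.announce("incident closed")


# Example run with made-up names; messages are collected instead of sent.
sent: list[str] = []
incident = Incident("checkout outage", manager="alice", notify=sent.append)
incident.announce("incident declared")
incident.add_responder("bob")
incident.close()
```

Routing every announcement through `record` means the log that 2.6 requires is built as a side effect of the communication that 2.4 and 2.7 require, so the postmortem does not depend on someone reconstructing the timeline afterwards.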