Use response plans to prepare for incidents and define how to respond to them. A response plan provides a template for when an incident occurs. This template includes information about who to engage, the expected severity of the event, automatic runbooks to initiate, and metrics to monitor. To create a response plan, use the following steps.
Planning for incidents in advance saves teams operational time when an incident occurs. Teams should consider the following best practices when designing a response plan.
- Streamlined engagement – Identify the most appropriate team for an incident. Engaging wide distribution lists or the wrong teams causes confusion and wastes responder time during incidents.
- Reliable escalation – Using escalation plans rather than contacts ensures that responders are effectively and reliably engaged. Even with the best intentions, responders are sometimes unreachable. Having a backup responder configured in an escalation plan covers these scenarios.
- Runbooks – Developing runbooks that provide repeatable and understandable steps helps reduce the stress responders experience during incidents.
- Collaboration – Use chat channels to streamline communication during incidents. Chat channels help responders stay up to date with information and also share information with other responders.
Using the response plan best practices and the Incident Manager console, create dynamic response plans to automate incident response.
Create a response plan.
-
Open the Incident Manager console, and in the left navigation, choose Response plans.
-
Choose Create response plan.
-
Enter a unique and identifiable response plan Name. The response plan name can contain only alphanumeric characters, hyphens, and underscores.
-
(Optional) Enter a Display name. Use the display name to give the response plan a more user-friendly name.
-
Enter an incident title. The incident title helps to identify an incident on the incidents home page.
-
To indicate the potential scope of the incident, choose an Impact.
-
(Optional) Provide a brief description of the incident.
-
(Optional) Provide a dedupe string. Incident Manager uses the dedupe string to prevent the same root cause from creating multiple incidents.
-
Choose a chat channel for the incident responders to interact in during an incident. For more information about chat channels, see Chat channels.
Important: Incident Manager must have permissions to publish to the chat channel's SNS topic. Without permissions to publish to the SNS topic, you can't add it to the response plan. Incident Manager verifies permissions by publishing a test notification to the SNS topic.
-
(Optional) Choose additional SNS topics to publish to during the incident. Adding SNS topics in multiple Regions increases redundancy in case a Region is down at the time of the incident.
-
For Engagement, choose any number of contacts and escalation plans. For information about contact and escalation plan creation, see Contacts and Escalation plans.
-
To select a Runbook, do one of the following:
- Choose Select an existing runbook. Select the Owner, Runbook, and Version. For information about runbook creation, see Runbooks and automation.
- Choose Clone runbook from template. Enter a descriptive runbook name.
-
Either choose an existing role or use the following steps to create a new role. If you choose an existing role, it must allow Incident Manager to start an automation execution on your behalf.
-
Open the IAM console at https://console.aws.amazon.com/iam/.
-
Choose Roles from the left navigation and choose Create role.
-
Select Incident Manager and choose the Incident Manager use case.
-
Choose Next: Permissions.
-
Choose Create policy and then choose the JSON tab.
-
Copy the following JSON policy and paste it into the JSON editor. The account ID 111122223333 and the document name DocumentName are placeholders; replace them with your own values.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Resource": "arn:aws:ssm:*:111122223333:automation-definition/DocumentName:*",
            "Action": "ssm:StartAutomationExecution"
        },
        {
            "Effect": "Allow",
            "Resource": "arn:aws:iam::*:role/AWS-SystemsManager-AutomationExecutionRole",
            "Action": "sts:AssumeRole"
        }
    ]
}
-
Choose Next: Tags.
-
(Optional) Add tags to your policy.
-
Choose Next: Review.
-
Provide a Name and optionally provide a Description for the policy.
-
Choose Create policy.
-
Navigate back to the role you were creating and select the policy you created.
-
(Optional) Add tags to your role.
-
Provide a Role name and optionally update the Role description.
-
Choose Create role.
-
Navigate back to the response plan you are creating and refresh the Role name dropdown.
-
Select the role you created in the IAM console.
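For teams that prefer scripting over the console, the role-creation steps above can be approximated with the AWS SDK for Python. The following is a minimal sketch, not the documented procedure: the helper names (build_automation_policy, build_trust_policy, create_runbook_role) are hypothetical, and the trust policy assumes that the Incident Manager use case in the IAM console corresponds to the ssm-incidents.amazonaws.com service principal. The permissions policy mirrors the JSON shown in the steps.

```python
import json


def build_automation_policy(account_id: str, document_name: str) -> dict:
    """Return the permissions policy from the steps above, with the
    placeholder account ID and document name filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Resource": (
                    f"arn:aws:ssm:*:{account_id}:"
                    f"automation-definition/{document_name}:*"
                ),
                "Action": "ssm:StartAutomationExecution",
            },
            {
                "Effect": "Allow",
                "Resource": "arn:aws:iam::*:role/AWS-SystemsManager-AutomationExecutionRole",
                "Action": "sts:AssumeRole",
            },
        ],
    }


def build_trust_policy() -> dict:
    """Trust policy letting Incident Manager assume the role.

    Assumption: the service principal is ssm-incidents.amazonaws.com.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ssm-incidents.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_runbook_role(role_name: str, policy_name: str,
                        account_id: str, document_name: str) -> str:
    """Create the role and attach the policy; returns the role ARN.

    Requires boto3 and AWS credentials, so it is imported lazily and
    never called at import time.
    """
    import boto3  # third-party; only needed when this function is called

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_trust_policy()),
    )
    policy = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(
            build_automation_policy(account_id, document_name)
        ),
    )
    iam.attach_role_policy(
        RoleName=role_name, PolicyArn=policy["Policy"]["Arn"]
    )
    return role["Role"]["Arn"]
```

Keeping the policy builders separate from the function that calls AWS makes it possible to review or print the generated JSON before creating anything in the account.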
-
Choose the Execution target.
-
(Optional) Add tags to your response plan.
-
Choose Create response plan.
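The console procedure above can also be expressed as a single API request. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and its ssm-incidents client; the helper names build_response_plan_request and create_response_plan_from are hypothetical, and the 1 to 5 range check on impact is an assumption about the API rather than something stated in this procedure. The name check reflects the rule that a response plan name contains only alphanumeric, hyphen, and underscore characters.

```python
import re

# Response plan names: alphanumeric, hyphen, and underscore characters only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_response_plan_request(name, title, impact, chat_sns_arns,
                                engagement_arns, *, display_name=None,
                                summary=None, dedupe_string=None,
                                runbook=None, tags=None) -> dict:
    """Assemble keyword arguments for a create_response_plan call.

    Optional console fields are included only when provided.
    """
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid response plan name: {name!r}")
    if impact not in range(1, 6):  # assumption: impact is an integer 1-5
        raise ValueError(f"impact must be between 1 and 5, got {impact!r}")

    incident_template = {"title": title, "impact": impact}
    if summary:
        incident_template["summary"] = summary
    if dedupe_string:
        incident_template["dedupeString"] = dedupe_string

    request = {
        "name": name,
        "incidentTemplate": incident_template,
        # SNS topics of the chat channel for responders
        "chatChannel": {"chatbotSns": list(chat_sns_arns)},
        # ARNs of contacts and escalation plans to engage
        "engagements": list(engagement_arns),
    }
    if display_name:
        request["displayName"] = display_name
    if runbook:
        # runbook: dict with roleArn, documentName, and optionally
        # documentVersion and targetAccount
        request["actions"] = [{"ssmAutomation": runbook}]
    if tags:
        request["tags"] = dict(tags)
    return request


def create_response_plan_from(request: dict) -> str:
    """Send the request; returns the new response plan's ARN.

    Requires boto3 and AWS credentials, so boto3 is imported lazily.
    """
    import boto3  # third-party; only needed when this function is called

    client = boto3.client("ssm-incidents")
    return client.create_response_plan(**request)["arn"]
```

A request built this way can be inspected or stored in version control before it is sent, which may help when several teams maintain similar response plans.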