Getting Started with Incident Management
Learn how to manage incidents using Datadog Incident Management. By the end of this course, you'll know how to set up Incident Management, detect and declare incidents, and guide your team through resolution.
Production incidents are inevitable. When systems fail, how effectively your team responds can mean the difference between a minor disruption and a major outage. Datadog Incident Management gives you a central place to declare, track, and manage incidents from detection to resolution.
In this course, you'll work through a realistic incident scenario at a fictional e-commerce company, Storedog. You'll configure monitor notifications and incident notification rules so the right people are alerted when issues arise. When high error rates hit two services, you'll declare an incident, investigate the root cause, and share your findings with your team through the incident workbench and timeline. After resolving the incident, you'll generate an AI-powered postmortem and review incident analytics. From there, you'll identify improvements to your incident response process: configuring a status page and setting up incident automations.
By the end of this course, you will be able to:
This course is designed for engineers who use Datadog to monitor their applications and are involved in the incident response lifecycle. It's particularly suitable for DevOps Engineers, Software Engineers, Site Reliability Engineers (SREs), and Engineering Managers who serve as Incident Responders or Incident Commanders.
The prerequisites for this course are the following:
In order to complete the course, you will need:
At the bottom of each lesson, click the MARK LESSON COMPLETE AND CONTINUE button so that the lesson is marked complete and you can receive the certificate at the end of the course.
Please note that your enrollment in this course ends after 30 days. You can re-enroll at any time and pick up where you left off.
Introduction to Incident Management
Incident Management Best Practices
Datadog Incident Management
Lab: Datadog Incident Management
Summary
Feedback Survey