Abstract

Ineffective incident management workflows can affect revenue, team morale, and development velocity. There’s nothing more stressful than siloed communication while debugging a production outage.

Datadog Incident Management enables you to manage your team’s incident response workflows in a central location. Within Datadog, you’re able to declare incidents in response to alerts, investigate issues, coordinate incident response efforts, and remediate incidents, all without switching contexts or tools.

In this hands-on workshop, you’ll learn the ins and outs of managing incidents, work through a realistic incident scenario, and learn how to effectively use Datadog's integrations to automatically communicate incident status to your broader organization. You'll also learn how to leverage Incident Management, dashboards, and Notebooks to automatically populate postmortems with incident artifacts, such as assigned tasks, resolution steps, and dashboard snapshots.

Learning Objectives

Upon completing this workshop, you will be able to declare incidents across Datadog and directly from monitors. You will also be able to use the Datadog Incident App to create and assign tasks, add relevant graphs to the incident timeline, and automatically create a postmortem.

Primary Audience

This workshop is designed for anyone who uses Datadog to monitor their applications and who takes part in the incident lifecycle, possibly acting as an incident responder. This includes DevOps Engineers, Software Engineers, System Administrators, and SREs.

Prerequisites

Attendees are expected to have a basic familiarity with Datadog Monitors, Dashboards, and Notebooks. They should also have a basic understanding of incident response in general (i.e., working together as a team to resolve an issue that impacts customers) and a basic understanding of Slack.

Technical Requirements

In order to complete the course, you will need:

  • Google Chrome or Firefox
  • Third-party cookies enabled (required to access labs)

Course Navigation

At the bottom of each lesson, click the MARK LESSON COMPLETE AND CONTINUE button so that the lesson is marked complete and you can receive the certificate at the end of the course.

Course Enrollment Period

Please note that your enrollment in this course ends after 30 days. You can re-enroll at any time and pick up where you left off.

Curriculum

    1. Introduction

    2. Known bugs

    1. Video Lesson

    2. Lab: Unified Incident Management

    3. Feedback

    1. Further Reading

    2. Slides

Unified Incident Management: Accelerating Time to Resolution (online)

  • 1.5 hours to complete
  • 1 Lesson
  • 0.5 hours of video content
  • Beginner