Abstract

Kubernetes environments can be complex and opaque, making it difficult to find and fix problems quickly. To act with confidence, you need fast, clear insights. With Datadog, you get real-time visibility into every layer of your Kubernetes stack, from nodes to pods to services.

This course introduces you to Datadog's Kubernetes observability features through a hands-on SRE scenario where you'll investigate real performance issues and implement fixes to restore service stability.

Learning Objectives

By the end of this course, you will be able to:

  • Navigate the Kubernetes Explorer and interpret pod metrics, events, and resource utilization
  • Distinguish between metrics (what's happening) and events (why it's happening)
  • Troubleshoot performance issues and make capacity planning decisions

Primary Audience

This course is for Site Reliability Engineers (SREs), DevOps Engineers, Platform Engineers, and any Datadog user who will monitor and troubleshoot Kubernetes environments.

Prerequisites

  • Basic familiarity with Kubernetes concepts (pods, services, deployments)
  • Ability to use a Linux command line shell
  • Familiarity with editing YAML in an IDE

Technical Requirements

  • Chrome or Firefox with third-party cookies enabled
  • A network connection that doesn’t block WebSockets or other technologies

Course Navigation

To track your progress and earn the course certificate, mark each lesson complete by clicking the MARK LESSON COMPLETE & CONTINUE button at the bottom of the lesson.

Course Enrollment Period

Please note that your enrollment in this course ends after 30 days. However, you can re-enroll at any time and pick up where you left off.

Course Curriculum

    1. Kubernetes Observability Overview

    2. Lab: Getting Started with Kubernetes Observability

    3. Kubernetes Observability Interfaces

    4. Additional Resources & Materials

    5. Feedback Survey

Getting Started with Kubernetes Observability

  • 2 hours to complete
  • 6 Lessons
  • Intermediate