LLM Observability Metrics & Traces Learning Path

Follow this curated path to build observability into your LLM applications and develop practical skills in instrumenting, monitoring, and troubleshooting model-driven systems using Datadog LLM Observability.

Through hands-on courses, you’ll learn to enable LLM Observability with auto-instrumentation, manually instrument complex multi-step workflows, and use metrics and traces to move from high-level signals to root causes. Along the way, you’ll capture meaningful span data, interpret LLM-specific metrics, and analyze latency, errors, and token usage in your own applications.

This path is designed for software developers, AI engineers, and DevOps engineers who build or maintain LLM-based systems and need clear visibility into model behavior, workflow execution, and performance.

You’ll learn how to do the following:

Enable LLM Observability in an application and understand the prompt, response, token, latency, and error data it captures
Instrument multi-step LLM workflows with manual tracing and annotations to make complex pipelines observable
Interpret LLM-specific metrics such as error rate, latency percentiles, token usage, and model distribution
Analyze end-to-end traces to understand workflow structure and where time is spent across agent, tool, retrieval, and LLM operations
Query trace details to diagnose errors, silent failures, and unexpected model behavior
Move from high-level metrics and dashboards to filtered traces to identify root causes of performance and quality issues
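As a preview of what the first course covers, LLM Observability auto-instrumentation is typically enabled with a few environment variables and Datadog's ddtrace-run wrapper. This is a minimal sketch based on Datadog's ddtrace tooling; the application name (my-llm-app) and entry point (app.py) are placeholders, and your API key and Datadog site will differ.

```shell
# Minimal sketch: enable LLM Observability in agentless mode.
# my-llm-app and app.py are placeholder values.
export DD_LLMOBS_ENABLED=1             # turn on LLM Observability
export DD_LLMOBS_ML_APP=my-llm-app     # name your traces are grouped under
export DD_LLMOBS_AGENTLESS_ENABLED=1   # send data directly, without a local Agent
export DD_API_KEY=<your-api-key>       # Datadog API key
export DD_SITE=datadoghq.com           # your Datadog site

ddtrace-run python app.py              # auto-instruments supported LLM SDKs
```

With this in place, prompt inputs, response outputs, token counts, latency, and errors from supported LLM SDK calls are captured without code changes; the courses below cover manual instrumentation for steps auto-instrumentation can't see.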

Getting Started with LLM Observability

Build observability into an LLM application. Monitor LLM performance and costs. Explore trace data with prompt inputs and response outputs. Analyze token usage and latency metrics. Identify errors and discover root causes.

View Course

Tracing LLM Applications

Trace key operations for end-to-end visibility in multi-step pipelines. Visualize execution flows in complex LLM chains. Debug failures and understand model behavior using detailed traces, contextual annotations, and performance metrics.

View Course

Investigate with LLM Observability

NEW! Investigate LLM application issues using metrics and traces. Move from high-level metrics to individual traces to identify the root cause of latency problems, silent pipeline failures, and quality issues hidden behind otherwise healthy operational metrics.

View Course

Leave feedback about your experience in our Learning Path Survey.

Complete all courses in the path to earn your Credly badge.