LLM Observability Metrics & Traces Learning Path
Follow this curated path to build observability into your LLM applications and develop practical skills in instrumenting, monitoring, and troubleshooting model-driven systems using Datadog LLM Observability.
Through hands-on courses, you’ll learn to enable LLM Observability with auto-instrumentation, manually instrument complex multi-step workflows, and use metrics and traces to move from high-level signals to root causes. Along the way, you’ll practice capturing meaningful span data, interpreting LLM-specific metrics, and analyzing latency, errors, and token usage in your own applications.
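As a taste of what enabling with auto-instrumentation looks like, here is a minimal sketch for a Python app. The app name and API key are placeholders, and available variables can differ by ddtrace version, so treat this as an illustration and check the current Datadog docs for your setup:

```shell
# Sketch: enable LLM Observability auto-instrumentation via environment
# variables, then launch the app under ddtrace-run.
export DD_LLMOBS_ENABLED=1
export DD_LLMOBS_ML_APP="my-llm-app"        # placeholder app name
export DD_API_KEY="<your-datadog-api-key>"  # placeholder key
export DD_SITE="datadoghq.com"
ddtrace-run python app.py
```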
This path is designed for software developers, AI engineers, and DevOps engineers who build or maintain LLM-based systems and need clear visibility into model behavior, workflow execution, and performance.
You’ll learn how to do the following:
Getting Started with LLM Observability
Build observability into an LLM application. Monitor LLM performance and costs. Explore trace data with prompt inputs and response outputs. Analyze token usage and latency metrics. Identify errors and discover root causes.
Tracing LLM Applications
Trace key operations for end-to-end visibility in multi-step pipelines. Visualize execution flows in complex LLM chains. Debug failures and understand model behavior using detailed traces, contextual annotations, and performance metrics.
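To make the idea of span data concrete, here is a small, SDK-free sketch of what manual instrumentation captures per step: name, latency, inputs, outputs, and token counts. This deliberately mimics the concept rather than the ddtrace SDK; all field names here are invented for illustration:

```python
import time
from contextlib import contextmanager

# Conceptual sketch only: mimics the kind of span data an LLM
# observability tool records per step. NOT the Datadog ddtrace SDK.
trace = []  # spans collected for one end-to-end request


@contextmanager
def traced_step(name, **tags):
    """Time a pipeline step and record it as a span dict."""
    span = {"name": name, "tags": tags, "error": None}
    start = time.perf_counter()
    try:
        yield span
    except Exception as exc:
        span["error"] = repr(exc)
        raise
    finally:
        span["latency_ms"] = (time.perf_counter() - start) * 1000
        trace.append(span)


# A two-step pipeline: retrieval feeds a stubbed LLM call.
with traced_step("retrieve_docs", source="vector_db") as span:
    span["output"] = ["doc-1", "doc-2"]

with traced_step("llm_call", model="stub-model") as span:
    span["input"] = "summarize doc-1 and doc-2"
    span["output"] = "a short summary"
    span["tokens"] = {"prompt": 42, "completion": 12}

# With spans in hand, you can move from a high-level signal
# (total latency) down to the slowest individual step.
total_ms = sum(s["latency_ms"] for s in trace)
slowest = max(trace, key=lambda s: s["latency_ms"])
```

With real instrumentation, these spans are sent to a backend and visualized as a flame graph, but the per-step structure is the same.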
Investigate with LLM Observability
NEW! Investigate LLM application issues using metrics and traces. Move from high-level metrics to individual traces to identify the root cause of latency problems, silent pipeline failures, and quality issues hidden behind healthy-looking operational metrics.
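The metrics-to-traces drilldown described above can be sketched without any SDK. The span records and field names below are invented for illustration (in Datadog you would filter spans in the LLM Observability UI instead), but they show why a "silent failure" never surfaces in error-rate metrics:

```python
# Illustrative only: span records a backend might hold for recent
# traffic. Field names are invented, not the Datadog schema.
spans = [
    {"trace_id": "t1", "name": "llm_call", "status": "ok",
     "latency_ms": 180, "output": "a useful answer"},
    {"trace_id": "t2", "name": "llm_call", "status": "ok",
     "latency_ms": 4100, "output": "a slow answer"},
    {"trace_id": "t3", "name": "llm_call", "status": "ok",
     "latency_ms": 150, "output": ""},  # "succeeded", returned nothing
    {"trace_id": "t4", "name": "llm_call", "status": "error",
     "latency_ms": 90, "output": None},
]

# High-level latency metric points at one outlier trace ...
slowest = max(spans, key=lambda s: s["latency_ms"])

# ... while a quality filter surfaces the silent failure:
# operationally "ok", yet the pipeline produced no output.
silent_failures = [s for s in spans
                   if s["status"] == "ok" and not s["output"]]
```

Note that an error-rate metric alone would count only one failure here; the empty-output trace looks healthy until you inspect individual spans.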