5 min read·Updated March 27, 2026

Datadog LLM Observability

By Datadog

Datadog LLM Observability monitors AI application performance, cost, and quality — tracking LLM calls, token usage, latency, and error rates alongside full-stack infrastructure metrics in a single platform.

Learning Objectives

  • Understand what LLM observability is and why it matters for production AI applications
  • Identify Datadog's key LLM monitoring features including tracing, cost tracking, and agentic AI monitoring
  • Compare Datadog LLM Observability to purpose-built alternatives like LangSmith and Helicone

What Is LLM Observability?

When you deploy an AI application in production, you need to know: Is it working? How much is it costing? Are responses accurate? How fast is it? LLM Observability answers these questions by monitoring every interaction between your application and AI models.

Datadog LLM Observability extends Datadog's monitoring platform to cover AI workloads. For supported providers and frameworks, it traces LLM calls automatically, capturing latency, token usage, estimated cost, error rates, and response quality, and correlates this data with your existing infrastructure metrics, application traces, and logs.

💡Key Concept

Observability vs. Monitoring: Monitoring tells you when something is wrong (an alert fires). Observability tells you why — by providing the detailed traces, metrics, and logs needed to diagnose problems. For AI applications, observability means seeing exactly which LLM call in a multi-step agent workflow caused a failure, how much each call cost, and how the AI's behavior changed after a prompt update.

Core Features

LLM Call Tracing

Datadog traces and annotates LLM calls automatically, without code changes, for the major providers its SDKs support. Each trace captures:

  • Latency — how long the model took to respond
  • Token usage — input and output tokens consumed
  • Estimated cost — calculated from provider pricing and token counts
  • Error rates — failed calls, timeouts, rate limits
  • Full request/response content — for debugging and evaluation
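To make the fields above concrete, here is a minimal sketch of a per-call trace record. It does not use Datadog's SDK; `LLMSpan`, `traced_call`, `fake_llm`, and the prices in `PRICE_PER_MILLION` are hypothetical names and values chosen for illustration only.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical per-million-token prices in USD; real provider prices differ.
PRICE_PER_MILLION = {"input": 3.00, "output": 15.00}


@dataclass
class LLMSpan:
    """One record per LLM call, mirroring the fields listed above."""
    model: str
    latency_s: float
    input_tokens: int
    output_tokens: int
    estimated_cost_usd: float
    error: Optional[str]
    request: str
    response: Optional[str]


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost = token counts multiplied by the per-token price."""
    return (input_tokens * PRICE_PER_MILLION["input"]
            + output_tokens * PRICE_PER_MILLION["output"]) / 1_000_000


def traced_call(model: str, prompt: str,
                call: Callable[[str], Tuple[str, int, int]]) -> LLMSpan:
    """Run `call`, timing it and recording tokens, cost, and any error."""
    start = time.perf_counter()
    try:
        text, tokens_in, tokens_out = call(prompt)
        error = None
    except Exception as exc:  # failed calls, timeouts, rate limits
        text, tokens_in, tokens_out = None, 0, 0
        error = type(exc).__name__
    latency = time.perf_counter() - start
    cost = estimate_cost(tokens_in, tokens_out)
    return LLMSpan(model, latency, tokens_in, tokens_out, cost,
                   error, prompt, text)


def fake_llm(prompt: str) -> Tuple[str, int, int]:
    """Stand-in for a provider call: returns (text, input tokens, output tokens)."""
    return "Paris", 1000, 200


span = traced_call("example-model", "What is the capital of France?", fake_llm)
print(span)
```

A real tracer would also attach trace and span IDs so the record can be correlated with APM traces and logs; that part is omitted here.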

Execution Flow Charts

Visual diagrams showing agent decision paths, tool usage, and retrieval steps. See exactly how a multi-step AI agent navigated a complex task — which tools it called, what data it retrieved, and where it decided to branch.
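An execution flow chart is, in essence, a rendering of a tree of nested spans. The sketch below shows that idea with a hypothetical `Span` type and a plain-text renderer; the span kinds and names are made up for illustration and are not Datadog's data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    kind: str   # e.g. "agent", "llm", "tool", "retrieval"
    name: str
    children: List["Span"] = field(default_factory=list)


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, parents before children."""
    lines = ["  " * depth + f"{span.kind}: {span.name}"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# A hypothetical agent run: plan, retrieve, call a tool, then answer.
trace = Span("agent", "answer_question", [
    Span("llm", "plan"),
    Span("retrieval", "search_docs"),
    Span("tool", "calculator"),
    Span("llm", "final_answer"),
])

print("\n".join(render(trace)))
```

Reading the indented output top to bottom gives the order of steps; the indentation shows which step triggered which.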

AI Agents Console (June 2025)

A dedicated dashboard for monitoring AI agents in production:

  • Track actions, security posture, and performance of any AI agent
  • Monitor user engagement and business value metrics
  • Works with both custom-built and third-party agents
  • Visibility into agentic workflows spanning multiple models and tools

LLM Experiments (June 2025)

A structured experimentation framework for testing changes before shipping to production:

  • Compare prompt changes, model swaps, and configuration updates
  • Measure impact on quality, latency, and cost
  • Prove results before rolling out to users
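The comparison described above can be sketched as averaging each metric per variant and looking at the differences. The run data, metric names, and the `summarize` and `compare` helpers below are hypothetical; this is not the LLM Experiments API.

```python
from statistics import mean
from typing import Dict, List

Run = Dict[str, float]
METRICS = ("quality", "latency_s", "cost_usd")


def summarize(runs: List[Run]) -> Dict[str, float]:
    """Average each metric over all runs of one variant."""
    return {m: mean(r[m] for r in runs) for m in METRICS}


def compare(baseline: List[Run], candidate: List[Run]) -> Dict[str, float]:
    """Return candidate minus baseline for each averaged metric."""
    base, cand = summarize(baseline), summarize(candidate)
    return {m: cand[m] - base[m] for m in METRICS}


# Made-up results for the current prompt and a proposed prompt change.
baseline_runs = [
    {"quality": 0.80, "latency_s": 1.2, "cost_usd": 0.004},
    {"quality": 0.70, "latency_s": 1.0, "cost_usd": 0.006},
]
candidate_runs = [
    {"quality": 0.90, "latency_s": 1.5, "cost_usd": 0.003},
    {"quality": 0.80, "latency_s": 1.3, "cost_usd": 0.003},
]

for metric, delta in compare(baseline_runs, candidate_runs).items():
    print(f"{metric}: {delta:+.3f}")
```

In this made-up data the candidate scores higher on quality and costs less but is slower, which is the kind of trade-off an experiment is meant to surface before rollout.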

Bits AI Copilot

Datadog's built-in AI assistant that queries across all your observability data using natural language:

  • Identifies root causes "90% faster" than manual investigation
  • Integrates into Slack incident response channels with automatic summaries
  • Can automate alert investigations, code fixes, and security triage

Supported LLM Providers

SDK or standard | Supported Providers
Python SDK | OpenAI; Anthropic; AWS Bedrock; LangChain; Google Vertex AI
Node.js SDK | OpenAI; Anthropic; Azure OpenAI; AWS Bedrock; Google Vertex AI; LangChain; Vercel AI SDK
OpenTelemetry | Vendor-neutral via GenAI Semantic Conventions (any provider)
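For the OpenTelemetry route, a span describes an LLM call through standardized attribute names. The sketch below builds such an attribute set as a plain dictionary. The `gen_ai.*` keys follow the GenAI Semantic Conventions as best understood at the time of writing; those conventions are still evolving, so check the current specification before relying on any key. The helper name and the model name are hypothetical.

```python
from typing import Dict, Union

AttributeValue = Union[str, int]


def genai_span_attributes(operation: str, model: str,
                          input_tokens: int,
                          output_tokens: int) -> Dict[str, AttributeValue]:
    """Build span attributes for one LLM call (assumed attribute names)."""
    return {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }


# "example-model" is a placeholder, not a real model identifier.
attrs = genai_span_attributes("chat", "example-model", 1000, 200)
for key, value in attrs.items():
    print(f"{key} = {value}")
```

Because the attribute names are vendor-neutral, the same instrumentation can in principle be sent to any backend that understands these conventions.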

Additional integrations include GitHub Copilot usage tracking, Microsoft Copilot monitoring, LiteLLM gateway tracing, and cloud cost management for Anthropic and GitHub spend.

Pricing

Datadog LLM Observability is billed per LLM span (each call to an LLM provider counts as one span; a single user request may generate multiple spans).

⚠️Warning

LLM Observability is an add-on to Datadog's platform — there is no standalone free tier. Pricing is not fully transparent on the public pricing page; enterprise customers typically negotiate custom rates. Third-party estimates suggest approximately $8 per 10,000 requests, but verify current rates at datadoghq.com/pricing.
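Because billing is per span while the third-party estimate is quoted per request, a rough budget depends on which unit the rate applies to. The sketch below computes both readings. The $8 per 10,000 figure is the unverified estimate quoted above, and the traffic numbers and function name are hypothetical.

```python
def estimate_monthly_cost(requests_per_month: int,
                          spans_per_request: int,
                          usd_per_10k_units: float = 8.0,
                          bill_by_span: bool = True) -> float:
    """Rough monthly cost in USD.

    If bill_by_span is True, every LLM span is one billable unit;
    otherwise each user request is one unit. Which reading matches
    the actual contract is an assumption to verify with Datadog.
    """
    units = requests_per_month * spans_per_request if bill_by_span else requests_per_month
    return units / 10_000 * usd_per_10k_units


# Hypothetical workload: 500,000 requests a month, 3 LLM calls per request.
per_span = estimate_monthly_cost(500_000, 3, bill_by_span=True)
per_request = estimate_monthly_cost(500_000, 3, bill_by_span=False)
print(f"Rate applied per span:    ${per_span:,.2f}")
print(f"Rate applied per request: ${per_request:,.2f}")
```

With these made-up numbers the two readings differ by a factor of three, which is why multi-step agents can be much more expensive to monitor than single-call applications.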

For teams that only need LLM monitoring without full-stack observability, purpose-built tools like Helicone (open-source, generous free tier) or LangSmith ($39 per user per month) offer much lower entry points.

Datadog LLM Observability vs. Competitors

Platform | Best For | Key Advantage
Datadog LLM Observability | Enterprise teams already on Datadog | Full-stack correlation: LLM + infrastructure + APM + logs in one platform
LangSmith | LangChain/LangGraph users | Zero-config for LangChain; excellent debugging; $39/user/month
Helicone | Startups and lightweight LLM logging | Open-source; 1-line proxy integration; generous free tier
Arize AI | ML teams needing evaluation and drift detection | Strong evaluation metrics; MLOps heritage
New Relic | Enterprise teams already on New Relic | Consumption-based pricing; full-stack monitoring

Datadog's main advantage: it correlates LLM performance with the rest of the application stack, including APM traces, infrastructure metrics, logs, cloud costs, and security signals, in a single platform. Among the tools compared here, only New Relic also offers full-stack monitoring.

Company Details

Detail | Info
Company | Datadog Inc. (NASDAQ: DDOG)
Founded | 2010
CEO | Olivier Pomel (co-founder)
Headquarters | New York, New York
Employees | ~9,700
Revenue (FY2025) | $3.43 billion (+28% year-over-year)
2026 Revenue Guidance | $4.06-$4.10 billion
Free Cash Flow (FY2025) | $915 million
Market Cap | ~$44-46 billion
Total Customers | ~32,700
Fortune 500 Penetration | 48%
Million-Dollar Customers | 603 (+31% year-over-year)
Website | datadoghq.com

Strengths

  • Full-stack correlation: LLM monitoring that integrates with infrastructure, APM, logs, security, and cloud costs in one platform, which the purpose-built LLM tools do not offer
  • No-code instrumentation — automatic tracing of LLM calls without code changes for major providers
  • Agentic AI monitoring — dedicated AI Agents Console and experiment framework for testing changes safely
  • Bits AI copilot — natural language querying across all observability data for faster incident response
  • Enterprise scale: about 32,700 customers, 48% of the Fortune 500, and $3.43 billion in FY2025 revenue

Limitations and Considerations

  • Cost — Datadog is expensive; LLM Observability is an add-on to an already premium platform with no standalone free tier
  • Platform lock-in — most valuable when you are already a Datadog customer using APM, logs, and infrastructure monitoring
  • Pricing opacity — per-span billing is not clearly published; costs can escalate quickly with high-volume AI applications
  • Overkill for simple use cases — if you only need to track LLM costs and latency, Helicone or LangSmith are simpler and cheaper
  • LLM-specific features are newer — purpose-built tools like LangSmith have deeper LLM debugging and evaluation capabilities

Key Takeaways

  • Datadog LLM Observability monitors AI application performance by tracking every LLM call — latency, tokens, cost, errors — and correlating with full-stack infrastructure metrics
  • The AI Agents Console and LLM Experiments features (launched June 2025) enable monitoring agentic AI workflows and testing changes before production
  • Most valuable for enterprise teams already using Datadog who want to add AI monitoring without adopting another vendor
  • For LLM-only monitoring without full-stack needs, purpose-built tools like Helicone (open-source, with a generous free tier) or LangSmith ($39 per user per month) are more cost-effective alternatives
