OpenTelemetry GenAI Semantic Conventions: Vendor-Neutral Standard for Agent and LLM Traces

What Happened

The OpenTelemetry GenAI Special Interest Group (SIG) has released the stable specification of the GenAI Semantic Conventions for LLM and agentic traces. Operating within the semantic-conventions-genai repository, this release establishes a uniform telemetry schema that allows developers and site reliability engineers (SREs) to monitor, debug, and optimize complex AI agent applications without relying on proprietary instrumentation.

Historically, monitoring AI agents required custom integrations for each LLM provider, framework, and APM vendor. The new OTel conventions standardize how LLM requests, model inputs/outputs, agent planning loops, and tool execution phases are recorded as spans.

This release provides concrete attributes and trace structures for recording token usage (gen_ai.usage.input_tokens, gen_ai.usage.output_tokens), model parameters (gen_ai.request.model, gen_ai.response.model, gen_ai.request.temperature), and agentic runtime context (gen_ai.agent.name, gen_ai.tool.name, and gen_ai.tool.status).

OpenTelemetry GenAI Semantic Conventions establish a unified, vendor-neutral framework for tracing LLM invocations and agentic reasoning paths, ensuring end-to-end observability across the cloud-native AI stack.

Key capabilities introduced in the OTel GenAI Semantic Conventions standard:

  • Structured LLM Model Call Instrumentation: Standardizes span attributes for model name, request temperature, top-p, finish reason, and token usage, rendering them readable by any OTel-compliant backend.
  • Hierarchical Agent Spans: Defines conventions for tracing agent planning and loop execution, grouping sub-steps like tool calls under a parent agent span.
  • W3C Trace Context Propagation: Standardizes trace context passing across multi-agent environments using traceparent headers, facilitating tracing from front-end user actions down to background tools and model calls.
  • Model Context Protocol (MCP) Support: Provides a schema to instrument tool-calling systems governed by the Model Context Protocol (MCP), recording tool discovery, parameters, and outputs.
  • Compatibility Flagging: Implements OTEL_SEMCONV_STABILITY_OPT_IN environment variable integration, allowing existing OTel SDK instances to opt into the GenAI schemas.
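The opt-in variable holds a comma-separated list of values. Below is a minimal sketch of how an instrumentation library might check it. Two things here are assumptions, not taken from the release notes: the value `gen_ai_latest_experimental` (the name used by the pre-stable GenAI conventions, which may differ for this release) and the helper name `genAiOptedIn`.

```typescript
// Assumption: 'gen_ai_latest_experimental' is the opt-in value. Check the
// conventions' documentation for the value that matches your SDK version.
const GEN_AI_OPT_IN = 'gen_ai_latest_experimental';

// Hypothetical helper: returns true when the comma-separated variable
// contains the GenAI opt-in value. Whitespace around entries is ignored.
function genAiOptedIn(rawValue: string | undefined): boolean {
  if (!rawValue) return false;
  return rawValue
    .split(',')
    .map((entry) => entry.trim())
    .some((entry) => entry === GEN_AI_OPT_IN);
}

// In Node.js the raw value would come from
// process.env.OTEL_SEMCONV_STABILITY_OPT_IN.
console.log(genAiOptedIn('http, gen_ai_latest_experimental'));
console.log(genAiOptedIn(undefined));
```

Taking the raw string as a parameter keeps the check easy to test without touching the process environment.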

Why It Matters

Observability as the Execution Gap Fix

In production, AI agents often fail silently or fall into unpredictable loops. Without tracing, a developer sees only that a request timed out or returned a bad response, and cannot tell whether the cause was an incorrect tool response, a model hallucination, or a latency bottleneck at the API gateway.

The OTel GenAI Semantic Conventions address this visibility gap. SREs can track the exact progression of an agentic workflow through a unified span hierarchy.

Figure: OpenTelemetry GenAI span hierarchy. A root span (gen_ai.system) branches into an LLM call (gen_ai.model.name) and an agent span (gen_ai.agent.name), which in turn spawns a tool span (gen_ai.tool.name).
The standardized span hierarchy organizes GenAI telemetry into logical, nested boundaries. The system span acts as the parent boundary, hosting model-call spans and agent execution loops. Agent spans house tool invocations as nested child spans, mapping the complete execution path for debugging.
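To make the nesting concrete, here is a small sketch of what a backend does when it rebuilds this hierarchy from flat span records linked by parent span IDs. The `SpanRecord` shape, the `renderTree` helper, and the sample span names are illustrative assumptions. The names follow an `{operation} {target}` pattern such as `invoke_agent`, `chat`, and `execute_tool`.

```typescript
// Simplified span record; real exported spans carry many more fields.
interface SpanRecord {
  spanId: string;
  parentSpanId?: string;
  name: string;
}

// Hypothetical helper: renders flat span records as an indented tree,
// two spaces per nesting level, preserving the input order of siblings.
function renderTree(spans: SpanRecord[]): string[] {
  const children = new Map<string | undefined, SpanRecord[]>();
  for (const span of spans) {
    const siblings = children.get(span.parentSpanId) ?? [];
    siblings.push(span);
    children.set(span.parentSpanId, siblings);
  }
  const lines: string[] = [];
  const walk = (parentId: string | undefined, depth: number): void => {
    for (const span of children.get(parentId) ?? []) {
      lines.push(`${'  '.repeat(depth)}${span.name}`);
      walk(span.spanId, depth + 1);
    }
  };
  walk(undefined, 0); // root spans have no parentSpanId
  return lines;
}

// Illustrative trace: one agent span with two model calls and a tool call.
const exampleTrace: SpanRecord[] = [
  { spanId: 'a1', name: 'invoke_agent research-agent' },
  { spanId: 'b2', parentSpanId: 'a1', name: 'chat claude-3-5-sonnet' },
  { spanId: 'c3', parentSpanId: 'a1', name: 'execute_tool web_search' },
  { spanId: 'd4', parentSpanId: 'a1', name: 'chat claude-3-5-sonnet' },
];

console.log(renderTree(exampleTrace).join('\n'));
```

Reading the indented output top to bottom gives the execution path: the agent asked the model, ran a tool, then asked the model again.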

With this hierarchy, APM platforms can automatically compute metrics such as:

  • Cost per Trace: Summing gen_ai.usage.input_tokens and gen_ai.usage.output_tokens across every model call in a trace, then applying per-token prices to estimate what the request cost.
  • Tool Latency and Error Rate: Filtering spans by gen_ai.tool.name to identify which tools cause bottlenecks or throw exceptions.
  • Sampling Parameter Impact: Comparing trace latency and outcomes against request parameters such as gen_ai.request.temperature and gen_ai.request.top_p.
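The first two metrics reduce to simple aggregations over span attributes. The sketch below assumes a simplified `GenAiSpan` record and made-up prices. The helper names `costPerTrace` and `toolStats` are hypothetical, and the token attributes are written with their full `gen_ai.usage.*` names.

```typescript
// Simplified span record holding only the fields this sketch reads.
interface GenAiSpan {
  traceId: string;
  durationMs: number;
  isError: boolean;
  attributes: Record<string, string | number>;
}

// Hypothetical prices in USD per million tokens; not real provider pricing.
const PRICE_PER_MILLION = { input: 3, output: 15 };

// Sums token usage across all spans of one trace and prices the total.
function costPerTrace(spans: GenAiSpan[], traceId: string): number {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const span of spans) {
    if (span.traceId !== traceId) continue;
    inputTokens += Number(span.attributes['gen_ai.usage.input_tokens'] ?? 0);
    outputTokens += Number(span.attributes['gen_ai.usage.output_tokens'] ?? 0);
  }
  return (inputTokens * PRICE_PER_MILLION.input + outputTokens * PRICE_PER_MILLION.output) / 1e6;
}

interface ToolStats { calls: number; meanMs: number; errorRate: number; }

// Groups spans by gen_ai.tool.name and reports calls, mean latency, error rate.
function toolStats(spans: GenAiSpan[]): Map<string, ToolStats> {
  const totals = new Map<string, { calls: number; totalMs: number; errors: number }>();
  for (const span of spans) {
    const tool = span.attributes['gen_ai.tool.name'];
    if (typeof tool !== 'string') continue; // not a tool span
    const entry = totals.get(tool) ?? { calls: 0, totalMs: 0, errors: 0 };
    entry.calls += 1;
    entry.totalMs += span.durationMs;
    if (span.isError) entry.errors += 1;
    totals.set(tool, entry);
  }
  const stats = new Map<string, ToolStats>();
  totals.forEach((entry, tool) => {
    stats.set(tool, {
      calls: entry.calls,
      meanMs: entry.totalMs / entry.calls,
      errorRate: entry.errors / entry.calls,
    });
  });
  return stats;
}

// Illustrative data: two model calls and two tool calls in trace 't1'.
const sampleSpans: GenAiSpan[] = [
  { traceId: 't1', durationMs: 900, isError: false,
    attributes: { 'gen_ai.usage.input_tokens': 1200, 'gen_ai.usage.output_tokens': 300 } },
  { traceId: 't1', durationMs: 700, isError: false,
    attributes: { 'gen_ai.usage.input_tokens': 800, 'gen_ai.usage.output_tokens': 200 } },
  { traceId: 't1', durationMs: 400, isError: false, attributes: { 'gen_ai.tool.name': 'web_search' } },
  { traceId: 't1', durationMs: 600, isError: true, attributes: { 'gen_ai.tool.name': 'web_search' } },
];

console.log(costPerTrace(sampleSpans, 't1'));
console.log(toolStats(sampleSpans).get('web_search'));
```

With the sample data, the trace totals 2,000 input and 500 output tokens, so the estimate is (2000 × 3 + 500 × 15) / 1,000,000 = 0.0135 USD at the assumed prices.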

Distributed Trace Propagation in Multi-Agent Systems

Many enterprise AI architectures deploy multiple cooperating agents as microservices. For instance, a user-facing Router Agent may delegate tasks to a Research Agent, which in turn calls an API service.

The OTel GenAI conventions rely on W3C Trace Context to carry trace identity across service boundaries. Because each service forwards the traceparent header, every span in the session shares a single trace ID, so the execution flow stays connected across network hops.

Figure: Multi-agent trace propagation. The W3C traceparent header travels from the Router Agent (Service A) to the Research Agent (Service B) and finally to the LLM model call.
Distributed tracing links separate services into a single execution timeline. The traceparent context propagates from the user-facing agent to secondary services and LLM providers, ensuring observability across distributed agent clusters.

This eliminates tracing gaps in complex agent loops, allowing developers to trace the prompt from the user interface, through custom microservices, and into the model.
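To show what actually travels between agents, the sketch below parses and rebuilds a version-00 `traceparent` header by hand. In a real service the OpenTelemetry SDK's W3C propagator does this for you through `propagation.inject` and `propagation.extract`. The helper names here are hypothetical and the span IDs are sample values.

```typescript
// Fields of a W3C Trace Context `traceparent` header (version 00).
interface TraceParent {
  traceId: string;  // 32 lowercase hex characters, shared by the whole trace
  parentId: string; // 16 lowercase hex characters, the caller's span ID
  sampled: boolean; // lowest bit of the trace-flags byte
}

const TRACEPARENT_PATTERN = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Returns the parsed header, or undefined when it is malformed.
function parseTraceParent(header: string): TraceParent | undefined {
  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (!match) return undefined;
  const [, traceId, parentId, flags] = match;
  // All-zero trace or parent IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return undefined;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Serializes the header; this sketch keeps only the sampled flag.
function formatTraceParent(ctx: TraceParent): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? '01' : '00'}`;
}

// What a downstream agent sends onward: same trace ID, its own span ID.
function childHeader(incoming: string, newSpanId: string): string | undefined {
  const parent = parseTraceParent(incoming);
  if (!parent) return undefined;
  return formatTraceParent({ ...parent, parentId: newSpanId });
}

// Router Agent -> Research Agent -> model call, all under one trace ID.
const fromRouter = '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01';
console.log(childHeader(fromRouter, 'b7ad6b7169203331'));
```

The trace ID stays constant at every hop while the parent ID changes, which is what lets a backend stitch spans from separate services into one timeline.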

Code Example: Instrumenting a GenAI Span

Implementing these conventions is straightforward. The following code snippet demonstrates how to instrument an LLM call using OpenTelemetry's TypeScript API, aligning with the new GenAI semantic attributes:

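import { trace, SpanKind, SpanStatusCode } from '@opentelemetry/api';

// Hypothetical provider client, declared only so the example type-checks.
// Replace it with your real SDK client and adapt the field names.
declare const modelProvider: {
  generate(request: { model: string; prompt: string; temperature: number }): Promise<{
    model: string;
    text: string;
    finish_reason: string;
    usage: { prompt_tokens: number; completion_tokens: number };
  }>;
};

const tracer = trace.getTracer('my-agent-application');

async function callLanguageModel(prompt: string): Promise<string> {
  const model = 'claude-3-5-sonnet';
  const temperature = 0.7;

  // Span name follows the "{operation} {model}" pattern; a model call is a CLIENT span.
  return tracer.startActiveSpan(`chat ${model}`, {
    kind: SpanKind.CLIENT,
    attributes: {
      'gen_ai.operation.name': 'chat',
      'gen_ai.system': 'anthropic',
      'gen_ai.request.model': model,
      'gen_ai.request.temperature': temperature,
      // Prompt text can contain sensitive data; record it only if your policy allows.
      'gen_ai.prompt': prompt,
    },
  }, async (span) => {
    try {
      // Call the model provider (hypothetical client, see declaration above)
      const response = await modelProvider.generate({ model, prompt, temperature });

      // Record response metadata and token usage once the call completes
      span.setAttributes({
        'gen_ai.response.model': response.model,
        'gen_ai.usage.input_tokens': response.usage.prompt_tokens,
        'gen_ai.usage.output_tokens': response.usage.completion_tokens,
        // finish_reasons is an array: one entry per returned choice
        'gen_ai.response.finish_reasons': [response.finish_reason],
      });

      span.setStatus({ code: SpanStatusCode.OK });
      return response.text;
    } catch (error: unknown) {
      // Normalize non-Error throwables before recording them on the span
      const err = error instanceof Error ? error : new Error(String(error));
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
      throw error;
    } finally {
      span.end();
    }
  });
}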

What to Watch Next

  • Framework Integrations: Watch for native OpenTelemetry GenAI exporters inside orchestrators like LangGraph, CrewAI, AutoGen, and Semantic Kernel. Native integration will allow developers to enable tracing with one environment variable instead of writing custom wrappers.
  • Observability Vendor Features: Trace-centric vendors like Datadog, Honeycomb, and Grafana are launching dedicated APM tabs for AI agents, leveraging these conventions to build dashboards for LLM usage, cost estimation, and tool latency.
  • Model Context Protocol (MCP) Auto-Telemetry: The MCP community is working to include OTel context propagation in the MCP protocol specification, which would automate tracing across any MCP server-client interaction.

Source

OpenTelemetry Semantic Conventions for GenAI Operations (2026)

Additional information: OpenTelemetry GenAI Spans Documentation
