AI agents fail in ways that logs don't capture. The agent called the right function, got a valid response, then produced the wrong output. By the time you notice, the trace is gone.
OpenTelemetry fixes this. Here's the full setup for a Claude-based agent.
The Problem With Console.log Debugging
A typical agent debugging session:
- User reports wrong output
- You add `console.log` at suspected failure points
- Reproduce the failure (if you can)
- Find the log line, add more logs around it
- Repeat
This works for synchronous code. For agents that run multi-step workflows, call tools in parallel, or execute asynchronously — it breaks down. You can't correlate log lines across steps without request IDs threaded through every call.
OpenTelemetry gives you distributed tracing: every step of agent execution is a span, spans are linked into a trace, and you can visualize the full execution tree.
Setup: Jaeger + OTEL SDK
Run Jaeger locally:
docker run -d --name jaeger \
-p 16686:16686 \
-p 4318:4318 \
jaegertracing/all-in-one:latest
Install OTEL packages:
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node \
@opentelemetry/exporter-trace-otlp-http @opentelemetry/api
Create the tracer setup (load before anything else):
// instrumentation.ts
import { NodeSDK } from '@opentelemetry/sdk-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { Resource } from '@opentelemetry/resources'
import { SEMRESATTRS_SERVICE_NAME } from '@opentelemetry/semantic-conventions'
const sdk = new NodeSDK({
resource: new Resource({
[SEMRESATTRS_SERVICE_NAME]: 'claude-agent',
}),
traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT ?? 'http://localhost:4318/v1/traces',
}),
instrumentations: [
getNodeAutoInstrumentations({
'@opentelemetry/instrumentation-http': { enabled: true },
      '@opentelemetry/instrumentation-undici': { enabled: true }, // covers Node's global fetch
}),
],
})
sdk.start()
process.on('SIGTERM', () => sdk.shutdown())
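Import order matters: if the entry point loads http or the Anthropic SDK before sdk.start() runs, those modules are never patched. One way to enforce it (server.ts is a placeholder name; ESM builds can use node --import ./instrumentation.js to the same effect):
// server.ts: keep this import first so sdk.start() executes before
// any module that OTEL needs to patch is loaded
import './instrumentation'

import { runAgent } from './lib/agent/traced-agent'
// ...register routes, start the HTTP server, etc.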
Instrumenting the Agent
// lib/agent/traced-agent.ts
import Anthropic from '@anthropic-ai/sdk'
import { trace, SpanStatusCode } from '@opentelemetry/api'
const tracer = trace.getTracer('claude-agent', '1.0.0')
const client = new Anthropic()
export interface Tool {
name: string
description: string
input_schema: object
execute: (input: unknown) => Promise<unknown>
}
export async function runAgent(userMessage: string, tools: Tool[], sessionId: string) {
return tracer.startActiveSpan('agent.run', async (rootSpan) => {
rootSpan.setAttributes({
'agent.session_id': sessionId,
'agent.user_message': userMessage.slice(0, 200),
})
try {
const messages: Anthropic.MessageParam[] = [{ role: 'user', content: userMessage }]
let iteration = 0
while (iteration < 10) {
        const response = await tracer.startActiveSpan('agent.llm_call', async (llmSpan) => {
          llmSpan.setAttributes({
            'llm.model': 'claude-sonnet-4-6',
            'llm.iteration': iteration,
            'llm.message_count': messages.length,
          })
          try {
            const result = await client.messages.create({
              model: 'claude-sonnet-4-6',
              max_tokens: 4096,
              tools: tools.map(t => ({
                name: t.name,
                description: t.description,
                input_schema: t.input_schema as Anthropic.Tool['input_schema'],
              })),
              messages,
            })
            llmSpan.setAttributes({
              'llm.input_tokens': result.usage.input_tokens,
              'llm.output_tokens': result.usage.output_tokens,
              'llm.stop_reason': result.stop_reason ?? '',
            })
            return result
          } catch (err) {
            // a failed API call should close the span with an error, not leak it
            llmSpan.setStatus({ code: SpanStatusCode.ERROR, message: String(err) })
            llmSpan.recordException(err as Error)
            throw err
          } finally {
            llmSpan.end()
          }
        })
        // anything other than tool_use means the model is done (or was cut off)
        if (response.stop_reason !== 'tool_use') {
const output = response.content
.filter(b => b.type === 'text')
.map(b => (b as Anthropic.TextBlock).text)
.join('')
rootSpan.setAttribute('agent.output', output.slice(0, 500))
rootSpan.setStatus({ code: SpanStatusCode.OK })
rootSpan.end()
return output
}
const toolUses = response.content.filter(b => b.type === 'tool_use')
messages.push({ role: 'assistant', content: response.content })
const toolResults = await Promise.all(
toolUses.map(async (block) => {
const toolBlock = block as Anthropic.ToolUseBlock
const tool = tools.find(t => t.name === toolBlock.name)
return tracer.startActiveSpan(`agent.tool.${toolBlock.name}`, async (toolSpan) => {
toolSpan.setAttributes({
'tool.name': toolBlock.name,
'tool.input': JSON.stringify(toolBlock.input).slice(0, 500),
})
            try {
              if (!tool) throw new Error(`Unknown tool: ${toolBlock.name}`)
              const result = await tool.execute(toolBlock.input)
toolSpan.setStatus({ code: SpanStatusCode.OK })
toolSpan.end()
return {
type: 'tool_result' as const,
tool_use_id: toolBlock.id,
content: JSON.stringify(result),
}
} catch (err) {
toolSpan.setStatus({ code: SpanStatusCode.ERROR, message: String(err) })
toolSpan.recordException(err as Error)
toolSpan.end()
return {
type: 'tool_result' as const,
tool_use_id: toolBlock.id,
content: `Error: ${String(err)}`,
is_error: true,
}
}
})
})
)
messages.push({ role: 'user', content: toolResults })
iteration++
}
throw new Error('Max iterations reached')
} catch (err) {
rootSpan.setStatus({ code: SpanStatusCode.ERROR, message: String(err) })
rootSpan.recordException(err as Error)
rootSpan.end()
throw err
}
})
}
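Wiring a tool into the loop looks like this. A sketch: search_documents matches the Jaeger trace above, but searchDocuments itself is a stand-in for your own lookup function:
// lib/agent/run-example.ts: one tool wired into the traced loop
import { randomUUID } from 'node:crypto'
import { runAgent, type Tool } from './traced-agent'

// stand-in: replace with your own document store
async function searchDocuments(query: string): Promise<string[]> {
  return [`no results for "${query}"`]
}

const tools: Tool[] = [{
  name: 'search_documents',
  description: 'Full-text search over internal documents',
  input_schema: {
    type: 'object',
    properties: { query: { type: 'string' } },
    required: ['query'],
  },
  execute: async (input) => searchDocuments((input as { query: string }).query),
}]

const output = await runAgent('Summarize the dispute on invoice #1234', tools, randomUUID())
console.log(output)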
What You See in Jaeger
After running a few agent calls, open http://localhost:16686. Select the claude-agent service and pick any trace. You'll see:
agent.run (340ms)
├── agent.llm_call [iteration=0] (210ms)
│ input_tokens=847, output_tokens=312
├── agent.tool.search_documents (45ms)
│ query="invoice #1234"
├── agent.tool.get_customer (23ms)
│ customer_id="cust_abc"
├── agent.llm_call [iteration=1] (180ms)
│ input_tokens=1204, output_tokens=89
└── [end_turn]
When a tool fails, the span turns red. When the LLM loops unexpectedly, you see the iteration count climb. Token costs per session are visible without any extra instrumentation.
Production Considerations
- Sample aggressively — trace 10% of traffic and 100% of errors (see the sampler sketch after this list)
- Redact PII — the example above puts truncated user content in span attributes for readability; in production, replace it with hashed IDs (sketch below)
- Set span limits — truncate `agent.output` to 500 chars to prevent attribute size errors
- Use baggage for session ID — propagate `session_id` through async boundaries with `context.with()` (sketch below)
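Head sampling at 10% is one line of SDK config. A minimal sketch, assuming @opentelemetry/sdk-trace-node is available (it ships as a dependency of sdk-node). Note that keeping 100% of errors is a tail-sampling decision the SDK can't make alone; for that, route traces through an OpenTelemetry Collector with the tail_sampling processor.
// instrumentation.ts: head-sample 10% of new traces, and defer to the
// parent's decision when one exists so traces aren't split mid-tree
import { ParentBasedSampler, TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-node'

const sdk = new NodeSDK({
  sampler: new ParentBasedSampler({ root: new TraceIdRatioBasedSampler(0.1) }),
  // ...resource, traceExporter, instrumentations as shown earlier
})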
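For the PII bullet: runAgent stores truncated user text in agent.user_message, which is convenient in development but leaks content to your trace backend. A sketch of the hashed-ID alternative (agent.user_message_hash is a name chosen here, not a semantic convention):
import { createHash } from 'node:crypto'

// a stable, non-reversible reference: you can correlate sessions that saw
// the same input without storing the input itself
const messageHash = createHash('sha256').update(userMessage).digest('hex').slice(0, 16)
rootSpan.setAttribute('agent.user_message_hash', messageHash)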
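And for the last bullet, a sketch of threading the session ID across async boundaries with the baggage helpers in @opentelemetry/api (withSession is a hypothetical wrapper):
import { context, propagation } from '@opentelemetry/api'

export async function withSession<T>(sessionId: string, fn: () => Promise<T>): Promise<T> {
  // attach session.id to the active context; any span started inside fn,
  // however deeply awaited, can read it back without parameter threading
  const baggage = propagation.createBaggage({ 'session.id': { value: sessionId } })
  return context.with(propagation.setBaggage(context.active(), baggage), fn)
}

// inside any span callback:
// propagation.getBaggage(context.active())?.getEntry('session.id')?.value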
Full Observability Stack
OpenTelemetry traces + structured logs + Stripe event webhooks give you the complete picture of every agent session. This pattern is built into the Workflow Automator MCP — it adds tracing to any Claude agent running in the IDE.
- Workflow Automator MCP — $15/mo — pre-built OTEL instrumentation for Claude agent loops
- AI SaaS Starter Kit — $99 one-time — full production agent stack with tracing, auth, and billing