Control Plane’s observability layer gives you full visibility into what your agents do, how long each run takes, and what each run costs, in production and in CI.

Trace collection

Traces are collected via the Control Plane SDK and routed through the Control Plane Collector (an OpenTelemetry Collector configured for agent workloads).

Automatic instrumentation

When you call langship.init(), Control Plane patches the following libraries automatically:
| Library | What’s traced |
| --- | --- |
| openai | Every chat completion and embedding call |
| anthropic | Every message call |
| langchain | Agent loops, chains, tools, retrievers, memory |
| llama_index | Query engines, retrievers, LLM calls |
| cohere | Chat and embed calls |
No code changes needed beyond the init() call.
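Automatic instrumentation of this kind is commonly built by wrapping library methods when the SDK initializes. The sketch below shows that general technique on a stand-in client. `FakeClient`, `patch_method`, and `recorded_spans` are hypothetical names used only for illustration, and nothing here is Control Plane’s actual implementation.

```python
import functools
import time

recorded_spans = []  # stand-in for a span exporter (hypothetical)


class FakeClient:
    """Hypothetical stand-in for an LLM client library."""

    def complete(self, prompt):
        return prompt.upper()


def patch_method(cls, method_name):
    """Replace cls.method_name with a wrapper that records one span per call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = original(self, *args, **kwargs)
            status = "success"
            return result
        finally:
            recorded_spans.append({
                "name": f"{cls.__name__}.{method_name}",
                "duration_s": time.perf_counter() - start,
                "status": status,
            })

    setattr(cls, method_name, wrapper)


patch_method(FakeClient, "complete")
answer = FakeClient().complete("hello")  # calling code is unchanged
```

Because the wrapper is installed on the class, existing call sites pick it up without modification, which is why an init-time patch needs no further code changes.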

Manual spans

For code that Control Plane doesn’t auto-instrument, add spans manually:
with langship.span("my-custom-step", input={"query": query}) as span:
    result = my_custom_function(query)
    span.set_output(result)
Spans support arbitrary attributes:
span.set_attribute("customer_id", customer_id)
span.set_attribute("model_version", "v2.1")
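When the same function needs a span at every call site, a small decorator can remove the repetition. This is a minimal sketch, not part of the SDK: `traced`, `RecordingSpan`, and `fake_span` are hypothetical helpers, and the span factory is passed in explicitly so the example runs without Control Plane installed.

```python
import contextlib
import functools


def traced(span_factory, name):
    """Wrap a function so each call runs inside a span from span_factory."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with span_factory(name, input={"args": args, "kwargs": kwargs}) as span:
                result = func(*args, **kwargs)
                span.set_output(result)
                return result
        return wrapper
    return decorator


class RecordingSpan:
    """Hypothetical stand-in exposing the span methods used on this page."""

    def __init__(self, name, input):
        self.name, self.input, self.output = name, input, None
        self.attributes = {}

    def set_output(self, output):
        self.output = output

    def set_attribute(self, key, value):
        self.attributes[key] = value


finished = []  # spans that have exited their with-block


@contextlib.contextmanager
def fake_span(name, input=None):
    span = RecordingSpan(name, input)
    try:
        yield span
    finally:
        finished.append(span)


@traced(fake_span, "normalize-query")
def normalize(query):
    return query.strip().lower()


normalized = normalize("  Refund Policy ")
```

In real code you would pass `langship.span` as the factory, assuming it accepts the same `(name, input=...)` call shape used in the manual span example.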

Trace viewer

The trace viewer shows the full span tree for any run:
  • Timeline view: spans laid out on a horizontal time axis; see overlapping calls
  • Tree view: hierarchical view of parent/child spans
  • Detail panel: click any span to see full input, output, model metadata, token counts, and cost
Filter runs by:
  • Project and environment
  • Date range
  • Status (success / failure / error)
  • Tag or attribute value
  • Latency (runs slower than N ms)
  • Cost (runs above $X)
Search full-text across all span inputs and outputs.
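The filters combine with AND semantics in the sketch below, which applies the same kinds of criteria to run records held in memory. The `Run` fields and the `filter_runs` helper are hypothetical; the viewer’s actual data model and query behavior are not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    run_id: str
    status: str          # "success", "failure", or "error"
    latency_ms: float
    cost_usd: float
    tags: dict = field(default_factory=dict)


def filter_runs(runs, status=None, min_latency_ms=None, min_cost_usd=None, tag=None):
    """Return runs that match every filter that is not None."""
    selected = []
    for run in runs:
        if status is not None and run.status != status:
            continue
        # "slower than N ms" and "above $X" are strict comparisons
        if min_latency_ms is not None and run.latency_ms <= min_latency_ms:
            continue
        if min_cost_usd is not None and run.cost_usd <= min_cost_usd:
            continue
        if tag is not None and run.tags.get(tag[0]) != tag[1]:
            continue
        selected.append(run)
    return selected


runs = [
    Run("a", "success", 120, 0.002, {"customer_tier": "enterprise"}),
    Run("b", "error", 2400, 0.031, {"customer_tier": "free"}),
    Run("c", "success", 1800, 0.045, {"customer_tier": "enterprise"}),
]

slow_and_costly = filter_runs(runs, min_latency_ms=1000, min_cost_usd=0.04)
enterprise = filter_runs(runs, tag=("customer_tier", "enterprise"))
```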

Metrics

Control Plane aggregates per-project metrics over time:
| Metric | Description |
| --- | --- |
| P50 / P95 / P99 latency | Response time percentiles per run |
| Token usage | Input and output tokens, broken down by model |
| Cost | Estimated cost per run and per day (based on model pricing) |
| Error rate | Percentage of runs that errored |
| Tool call rate | Average number of tool calls per run |
| Eval pass rate | Percentage of runs that passed each evaluator |
Metrics are available in the dashboard and via the API for export to Grafana or other dashboards.
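To make the definitions concrete, here is a small sketch that computes latency percentiles, error rate, and tool call rate from a list of run records. It uses the nearest-rank percentile method, which is an assumption: the page does not say which percentile method Control Plane uses, and the record fields are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n) in sorted order."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(runs):
    """Aggregate per-run records into the metrics described above."""
    latencies = [r["latency_ms"] for r in runs]
    errored = sum(1 for r in runs if r["status"] == "error")
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "error_rate_pct": 100 * errored / len(runs),
        "avg_tool_calls": sum(r["tool_calls"] for r in runs) / len(runs),
    }


# Ten hypothetical runs with latencies 100, 200, ..., 1000 ms; two of them errored.
runs = [
    {
        "latency_ms": 100 * (i + 1),
        "status": "error" if i in (3, 7) else "success",
        "tool_calls": i % 3,
    }
    for i in range(10)
]
summary = summarize(runs)
```

With only ten runs, P95 and P99 both land on the slowest run, which is a reminder that tail percentiles are only meaningful over a reasonably large sample.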

Alerting

Set up alerts on any metric:
  1. Go to your project → Alerts → New Alert
  2. Choose a metric (e.g., P95 latency, error rate, cost per day)
  3. Set a threshold and a notification channel (email, Slack webhook, PagerDuty)
Alerts evaluate every 5 minutes.
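Conceptually, each evaluation compares the latest metric values against the configured thresholds. The sketch below illustrates that check under stated assumptions: `AlertRule`, `evaluate`, and the metric names are hypothetical, and the strict greater-than comparison is a choice made for the example, not documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    metric: str
    threshold: float
    channel: str  # e.g. "email", "slack", "pagerduty"


def evaluate(rules, snapshot):
    """Return (channel, message) pairs for each rule whose metric exceeds its threshold."""
    fired = []
    for rule in rules:
        value = snapshot.get(rule.metric)
        if value is not None and value > rule.threshold:
            fired.append((rule.channel, f"{rule.metric}={value} exceeds {rule.threshold}"))
    return fired


rules = [
    AlertRule("p95_latency_ms", 2000, "slack"),
    AlertRule("error_rate_pct", 5, "pagerduty"),
    AlertRule("cost_per_day_usd", 50, "email"),
]
snapshot = {"p95_latency_ms": 2350, "error_rate_pct": 1.2, "cost_per_day_usd": 50}
notifications = evaluate(rules, snapshot)
```

Only the latency rule fires here: the error rate is under its threshold, and a cost exactly equal to the threshold does not exceed it. With a 5-minute evaluation interval, a notification can arrive up to about 5 minutes after a threshold is first crossed.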

Exporting to external backends

Control Plane’s collector can forward traces to any OTLP-compatible backend in parallel with Control Plane’s own storage. In collector-config.yaml (included in the Docker Compose setup):
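exporters:
  # Assumes the upstream OpenTelemetry Collector exporter types (otlp, otlphttp);
  # the suffix after "/" only names the exporter instance.
  otlp/jaeger:
    endpoint: "jaeger:4317"   # Jaeger's OTLP gRPC port
    tls:
      insecure: true
  otlphttp/grafana:
    endpoint: "https://otlp-gateway-prod-us-central-0.grafana.net/otlp"
    headers:
      authorization: "Basic ${GRAFANA_TOKEN}"

service:
  pipelines:
    traces:
      exporters: [langship, otlp/jaeger, otlphttp/grafana]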

Session replay

Control Plane records the full conversation history for multi-turn agents. In the run detail view, click Session Replay to step through the conversation turn by turn, with the full trace visible for each turn.

Cost tracking

Control Plane estimates cost for every LLM call using current model pricing. Costs are visible:
  • Per span (each LLM call)
  • Per run (total)
  • Per project per day (aggregate)
  • In eval results (how much did this eval suite cost to run?)
Set a cost budget alert to notify you when daily costs exceed a threshold.
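The per-span and per-run figures follow from token counts and a price table. This sketch shows the arithmetic only: the model names and prices are hypothetical placeholders, and Control Plane’s actual pricing data and rounding rules are not described on this page.

```python
# Hypothetical prices in USD per 1M tokens; real prices vary by provider and change over time.
PRICES_PER_MILLION = {
    "example-small": {"input": 0.50, "output": 1.50},
    "example-large": {"input": 5.00, "output": 15.00},
}


def span_cost(model, input_tokens, output_tokens):
    """Estimated cost in USD of one LLM call."""
    price = PRICES_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def run_cost(spans):
    """Estimated cost in USD of a run: the sum over its LLM spans."""
    return sum(span_cost(s["model"], s["input_tokens"], s["output_tokens"]) for s in spans)


spans = [
    {"model": "example-small", "input_tokens": 2_000, "output_tokens": 500},
    {"model": "example-large", "input_tokens": 1_000, "output_tokens": 200},
]
total = run_cost(spans)  # 0.00175 + 0.008 = 0.00975 USD, up to float rounding
```

Input and output tokens are priced separately because providers typically charge different rates for each, which is also why the token usage metric tracks them separately.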

SDK API reference

import langship

# Initialize
langship.init(
    endpoint="https://langship.yourcompany.com",
    api_key="...",
    project="my-agent",
    environment="production",   # default: "default"
    sample_rate=1.0,            # trace 100% of runs; reduce for high-volume agents
)

# Manual span
with langship.span("step-name", input={...}) as span:
    output = do_work()
    span.set_output(output)
    span.set_attribute("key", "value")

# Log an event on the current span
langship.log("retrieved 3 documents", level="info")

# Tag the current run
langship.tag("customer_tier", "enterprise")

# Flush before exit (important in short-lived scripts)
langship.flush()
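To illustrate what a `sample_rate` below 1.0 means in practice, the sketch below keeps a deterministic fraction of runs by hashing a run identifier. This is one common sampling technique, shown under assumptions: `should_trace` and the run ID format are hypothetical, and the SDK’s actual sampling strategy is not documented here.

```python
import hashlib


def should_trace(run_id, sample_rate):
    """Deterministically keep roughly sample_rate of runs, keyed on run_id."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(run_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1)
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


run_ids = [f"run-{i}" for i in range(10_000)]
kept = sum(should_trace(r, 0.1) for r in run_ids)  # close to 1,000 of 10,000
```

Hashing the run ID instead of drawing a random number means the same run always gets the same decision, so all spans in a run are kept or dropped together.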