Observability

Control Plane’s observability layer shows what your agents do, how long each step takes, and what each run costs, both in production and in CI.

Trace collection

Traces are collected via the Control Plane SDK and routed through the Control Plane Collector (an OpenTelemetry Collector configured for agent workloads).

Automatic instrumentation

When you call langship.init(), Control Plane patches the following libraries automatically:

Library	What’s traced
`openai`	Every chat completion and embedding call
`anthropic`	Every message call
`langchain`	Agent loops, chains, tools, retrievers, memory
`llama_index`	Query engines, retrievers, LLM calls
`cohere`	Chat and embed calls

No code changes are needed beyond the init() call.
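To illustrate what patching a library at init time means in general terms, here is a minimal sketch of the wrap-and-record technique. It is not Control Plane's actual implementation: FakeClient, instrument, and recorded_spans are hypothetical names invented for this example, and a real SDK would export spans rather than append them to a list.

```python
import functools
import time

# Stand-in for a third-party client class (hypothetical, for illustration only).
class FakeClient:
    def complete(self, prompt):
        return prompt.upper()

recorded_spans = []  # hypothetical in-memory sink; a real SDK would export spans

def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records one span per call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        status = "error"  # assume failure until the call returns normally
        try:
            result = original(self, *args, **kwargs)
            status = "success"
            return result
        finally:
            recorded_spans.append({
                "name": f"{cls.__name__}.{method_name}",
                "status": status,
                "duration_s": time.perf_counter() - start,
            })

    setattr(cls, method_name, wrapper)

instrument(FakeClient, "complete")
reply = FakeClient().complete("hello")  # the caller's code is unchanged
```

The calling code is unchanged after patching, which is why no edits are needed beyond the single initialization call.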

Manual spans

For code that Control Plane doesn’t auto-instrument, add spans manually:

with langship.span("my-custom-step", input={"query": query}) as span:
    result = my_custom_function(query)
    span.set_output(result)

Spans support arbitrary attributes:

span.set_attribute("customer_id", customer_id)
span.set_attribute("model_version", "v2.1")

Trace viewer

The trace viewer shows the full span tree for any run:

Timeline view: spans laid out on a horizontal time axis, so overlapping (concurrent) calls are easy to spot
Tree view: hierarchical view of parent/child spans
Detail panel: click any span to see full input, output, model metadata, token counts, and cost
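As a rough picture of what the tree view represents, the following sketch turns a flat list of span records into an indented hierarchy. The record layout (span_id, parent_id, name, duration_ms) and the sample values are hypothetical; the actual trace schema is not described here.

```python
from collections import defaultdict

# Hypothetical flat span records: (span_id, parent_id, name, duration_ms)
spans = [
    ("a", None, "agent-run", 1200),
    ("b", "a", "retrieve", 300),
    ("c", "a", "llm-call", 800),
    ("d", "c", "tool:search", 150),
]

def render_tree(spans):
    """Return indented lines, one per span, children nested under parents."""
    children = defaultdict(list)
    for span_id, parent_id, name, duration_ms in spans:
        children[parent_id].append((span_id, name, duration_ms))

    lines = []

    def walk(parent_id, depth):
        for span_id, name, duration_ms in children[parent_id]:
            lines.append(f"{'  ' * depth}{name} ({duration_ms} ms)")
            walk(span_id, depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines

tree_lines = render_tree(spans)
print("\n".join(tree_lines))
```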

Filtering and search

Filter runs by:

Project and environment
Date range
Status (success / failure / error)
Tag or attribute value
Latency (runs slower than N ms)
Cost (runs above $X)

Full-text search covers all span inputs and outputs.
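The filters above combine with AND semantics in the sketch below, which applies them to exported run records on the client side. The Run fields and the filter_runs helper are hypothetical names chosen for this example; they do not describe the dashboard's query API.

```python
from dataclasses import dataclass, field

@dataclass
class Run:
    run_id: str
    status: str
    latency_ms: float
    cost_usd: float
    tags: dict = field(default_factory=dict)

def filter_runs(runs, status=None, min_latency_ms=None, min_cost_usd=None, tag=None):
    """Return runs matching every criterion that is not None."""
    matched = []
    for run in runs:
        if status is not None and run.status != status:
            continue
        if min_latency_ms is not None and run.latency_ms <= min_latency_ms:
            continue  # keep only runs slower than the limit
        if min_cost_usd is not None and run.cost_usd <= min_cost_usd:
            continue  # keep only runs above the cost limit
        if tag is not None and run.tags.get(tag[0]) != tag[1]:
            continue
        matched.append(run)
    return matched

runs = [
    Run("r1", "success", 850, 0.012, {"customer_tier": "enterprise"}),
    Run("r2", "error", 4200, 0.090, {"customer_tier": "free"}),
    Run("r3", "success", 3100, 0.045, {"customer_tier": "enterprise"}),
]
slow_enterprise = filter_runs(
    runs, min_latency_ms=2000, tag=("customer_tier", "enterprise")
)
```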

Metrics

Control Plane aggregates per-project metrics over time:

Metric	Description
P50 / P95 / P99 latency	Percentiles of end-to-end run latency
Token usage	Input and output tokens, broken down by model
Cost	Estimated cost per run and per day (based on model pricing)
Error rate	Percentage of runs that errored
Tool call rate	Average number of tool calls per run
Eval pass rate	Percentage of runs that passed each evaluator

Metrics are available in the dashboard and via the API for export to Grafana or other dashboards.
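For readers less familiar with latency percentiles, the sketch below computes P50, P95, P99, and an error rate from a small sample using the nearest-rank method. The percentile method Control Plane uses is not stated here, so nearest-rank is an assumption, and the sample data is made up.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Made-up sample: ten run latencies and their statuses.
latencies_ms = [120, 130, 150, 170, 200, 220, 260, 400, 900, 2500]
statuses = ["success"] * 9 + ["error"]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
error_rate = 100 * statuses.count("error") / len(statuses)
```

With only ten samples, P95 and P99 both land on the slowest run, which is why tail percentiles are only meaningful over a reasonably large window.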

Alerting

Set up alerts on any metric:

Go to your project → Alerts → New Alert
Choose a metric (e.g., P95 latency, error rate, cost per day)
Set a threshold and a notification channel (email, Slack webhook, PagerDuty)

Alerts are evaluated every 5 minutes.
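Conceptually, each evaluation compares the latest metric values against the configured thresholds. The sketch below shows that comparison for three rules. AlertRule, the metric keys, and the strict greater-than comparison are all assumptions made for illustration; the real rule schema and comparison semantics may differ.

```python
from dataclasses import dataclass

@dataclass
class AlertRule:
    metric: str       # hypothetical metric key
    threshold: float
    channel: str      # e.g. "email", "slack", "pagerduty"

def evaluate(rules, current_metrics):
    """Return (rule, value) pairs for every rule whose metric exceeds its threshold."""
    firing = []
    for rule in rules:
        value = current_metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            firing.append((rule, value))
    return firing

rules = [
    AlertRule("p95_latency_ms", 3000, "slack"),
    AlertRule("error_rate_pct", 5, "pagerduty"),
    AlertRule("cost_per_day_usd", 50, "email"),
]
snapshot = {"p95_latency_ms": 2100, "error_rate_pct": 7.5, "cost_per_day_usd": 50}
firing = evaluate(rules, snapshot)
```

In this snapshot only the error-rate rule fires; the cost value equals its threshold but does not exceed it.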

Exporting to external backends

Control Plane’s collector can forward traces to any OTLP-compatible backend in parallel with Control Plane’s own storage. In collector-config.yaml (included in the Docker Compose setup):

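exporters:
  # Jaeger accepts OTLP over gRPC on port 4317. The legacy `jaeger` exporter
  # (port 14250) has been removed from recent OpenTelemetry Collector releases.
  otlp/jaeger:
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  # Grafana Cloud's OTLP gateway speaks OTLP over HTTP.
  otlphttp/grafana:
    endpoint: "https://otlp-gateway-prod-us-central-0.grafana.net/otlp"
    headers:
      authorization: "Basic ${GRAFANA_TOKEN}"

service:
  pipelines:
    traces:
      exporters: [langship, otlp/jaeger, otlphttp/grafana]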

Session replay

Control Plane records the full conversation history for multi-turn agents. In the run detail view, click Session Replay to step through the conversation turn by turn, with the full trace visible for each turn.

Cost tracking

Control Plane estimates cost for every LLM call using current model pricing. Costs are visible:

Per span (each LLM call)
Per run (total)
Per project per day (aggregate)
In eval results (how much did this eval suite cost to run?)

Set a cost budget alert to notify you when daily costs exceed a threshold.
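The arithmetic behind a per-run estimate is token counts multiplied by per-token prices, summed over the LLM spans in the run. The sketch below works through one example. The model names and prices are placeholders, not real pricing, and span_cost is a hypothetical helper.

```python
# Placeholder prices in USD per 1M tokens (NOT real pricing).
PRICES = {
    "model-small": {"input": 0.50, "output": 1.50},
    "model-large": {"input": 5.00, "output": 15.00},
}

def span_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one LLM call."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

# (model, input_tokens, output_tokens) for each LLM span in one run
llm_spans = [
    ("model-large", 2_000, 500),
    ("model-small", 10_000, 1_000),
]
run_cost = sum(span_cost(*s) for s in llm_spans)

daily_budget_usd = 0.02  # hypothetical budget threshold
over_budget = run_cost > daily_budget_usd
```

Here the large-model call costs 0.0175 USD and the small-model call 0.0065 USD, for a run total of 0.024 USD.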

SDK API reference

import langship

# Initialize
langship.init(
    endpoint="https://langship.yourcompany.com",
    api_key="...",
    project="my-agent",
    environment="production",   # default: "default"
    sample_rate=1.0,            # trace 100% of runs; reduce for high-volume agents
)

# Manual span
with langship.span("step-name", input={...}) as span:
    output = do_work()
    span.set_output(output)
    span.set_attribute("key", "value")

# Log an event on the current span
langship.log("retrieved 3 documents", level="info")

# Tag the current run
langship.tag("customer_tier", "enterprise")

# Flush before exit (important in short-lived scripts)
langship.flush()
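To give a feel for what a sample_rate below 1.0 does, the sketch below makes a deterministic keep-or-drop decision per run by hashing the run ID into the range [0, 1). This is one common way to implement head sampling, shown as an assumption; how the SDK actually samples is not documented here, and should_sample is a hypothetical helper.

```python
import hashlib

def should_sample(run_id: str, sample_rate: float) -> bool:
    """Deterministically keep roughly sample_rate of runs, keyed on the run ID."""
    digest = hashlib.sha256(run_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform-ish value in [0, 1)
    return bucket < sample_rate

run_ids = [f"run-{i}" for i in range(10_000)]
kept = sum(should_sample(r, 0.1) for r in run_ids)
print(f"kept {kept} of {len(run_ids)} runs")
```

Because the decision depends only on the run ID, every span in a run gets the same answer, so a sampled trace is never partially recorded.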
