The langship CLI is the primary interface for running evals, managing datasets, and triggering deployments locally and in CI.

Installation

pip install langship
Verify:
langship --version
# langship 0.5.0

Authentication

# Log in through the browser (OAuth)
langship login

# Or log in with an API key
langship login --api-key your-api-key
Set the server URL for self-hosted deployments:
langship config set endpoint https://langship.yourcompany.com

Commands

langship eval

Run evaluators against your agent.
langship eval run [OPTIONS]
Options:
Flag                    Description
--config PATH           Path to langship.yaml (default: ./langship.yaml)
--eval NAME             Run only the named evaluator (repeat for multiple)
--dataset NAME          Override the dataset specified in config
--dataset-version SHA   Pin to a specific dataset version
--env ENVIRONMENT       Target environment (default: default)
--no-fail               Always exit 0 regardless of results
--output FORMAT         Output format: table (default), json, junit
--parallel N            Run N test cases in parallel (default: 5)
Examples:
# Run all evals
langship eval run

# Run a specific evaluator
langship eval run --eval factual-accuracy

# Output JUnit XML for CI integration
langship eval run --output junit > results.xml

# Run against a pinned dataset version
langship eval run --dataset-version sha256:abc123
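
For CI, the JUnit example and the repeatable --eval flag can be combined in a small wrapper. The following is a minimal sketch, not part of the CLI: the function name run_evals_for_ci and the evaluator name tone are hypothetical, and it assumes that langship eval run exits non-zero when evals fail (which the --no-fail flag implies) and writes the JUnit report to stdout.

```shell
# Run evals for CI: write a JUnit report and pass the CLI's exit status through.
# Usage: run_evals_for_ci RESULTS_FILE [EVALUATOR_NAME...]
# (hypothetical helper, not a langship command)
run_evals_for_ci() {
  results_file="$1"
  shift

  # Turn each remaining argument into a repeated `--eval NAME` flag.
  # The loop's word list is expanded once, so appending to "$@" is safe.
  count=$#
  for name in "$@"; do
    set -- "$@" --eval "$name"
  done
  shift "$count"

  # Assumption: the report is written to stdout, as in the JUnit example above.
  # The function returns whatever exit status the CLI returns.
  langship eval run --output junit "$@" > "$results_file"
}
```

A call such as run_evals_for_ci results.xml factual-accuracy tone would then run only those two evaluators and leave the report in results.xml, while the CI job still fails if the CLI reports a failure.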

langship dataset

Manage versioned datasets.
# List datasets in the project
langship dataset list

# Push a local JSONL file as a new dataset version
langship dataset push golden-set ./evals/golden-set.jsonl

# Show a specific dataset version
langship dataset show golden-set --version latest

# Pull a dataset version to a local file
langship dataset pull golden-set --version sha256:abc123 --output ./local-copy.jsonl

# Diff two versions
langship dataset diff golden-set sha256:abc123 sha256:def456
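
The pull and push commands can be combined to avoid creating a new dataset version when nothing has changed. This is a rough sketch under stated assumptions: the helper name push_if_changed is hypothetical, it assumes that pull accepts --version latest (the page only shows latest with show), and it assumes that pull exits non-zero when the dataset does not exist yet.

```shell
# Push LOCAL_FILE as a new version of DATASET only when it differs from the
# version currently stored on the server.
# Usage: push_if_changed DATASET LOCAL_FILE   (hypothetical helper)
push_if_changed() {
  dataset="$1"
  local_file="$2"
  remote_copy="$(mktemp)" || return 1

  # Assumptions: `pull` accepts `--version latest`, and it exits non-zero
  # when the dataset has no versions yet (in which case we push).
  if langship dataset pull "$dataset" --version latest --output "$remote_copy" \
      && cmp -s "$local_file" "$remote_copy"; then
    echo "unchanged: $dataset"
    rm -f "$remote_copy"
    return 0
  fi

  rm -f "$remote_copy"
  langship dataset push "$dataset" "$local_file"
}
```

For example, push_if_changed golden-set ./evals/golden-set.jsonl would push only when the local file is not byte-identical to the latest stored version.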

langship runs

Browse and inspect past runs.
# List recent runs
langship runs list

# List runs for a specific environment
langship runs list --env production

# Show full details for a run
langship runs show RUN_ID

# Show the trace for a run
langship runs trace RUN_ID

# Export a run's trace as OTLP JSON
langship runs export RUN_ID --format otlp

langship deploy

Deploy an agent version to an environment.
langship deploy [OPTIONS]
Options:
Flag                Description
--env ENVIRONMENT   Target environment (e.g., staging, production)
--agent-id ID       Agent ID in the target platform
--target PLATFORM   Deployment target: lyzr (default)
--dry-run           Validate config without deploying
--wait              Wait for deployment to complete
Examples:
# Deploy to staging
langship deploy --env staging --agent-id abc123

# Dry run to validate
langship deploy --env production --dry-run
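
The two examples above can be chained so that a deployment only happens after a successful validation. This is a sketch rather than a recommended workflow: the helper name deploy_after_dry_run is hypothetical, and it assumes that a failed --dry-run exits non-zero.

```shell
# Validate with --dry-run first; deploy only if validation succeeds.
# Usage: deploy_after_dry_run ENVIRONMENT AGENT_ID   (hypothetical helper)
deploy_after_dry_run() {
  env_name="$1"
  agent_id="$2"

  # Assumption: a dry run that finds an invalid config exits non-zero.
  if ! langship deploy --env "$env_name" --agent-id "$agent_id" --dry-run; then
    echo "dry run failed; not deploying to $env_name" >&2
    return 1
  fi

  # --wait makes the command wait for the deployment to complete.
  langship deploy --env "$env_name" --agent-id "$agent_id" --wait
}
```

Calling deploy_after_dry_run staging abc123 would run the dry run and, only if it succeeds, the real deployment.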

langship config

Manage CLI configuration.
# Show current config
langship config show

# Set a config value
langship config set endpoint https://langship.yourcompany.com
langship config set project my-agent
langship config set default_env staging

# Unset a config value
langship config unset default_env
Config is stored in ~/.langship/config.yaml.

langship init

Initialize a new langship.yaml in the current directory.
langship init
Interactive prompts will ask for your project name, default dataset path, and first evaluator type.

langship server

Start or stop a local Control Plane Server (development use).
# Start local server with Docker
langship server start

# Stop local server
langship server stop

# Show server status
langship server status

# View server logs
langship server logs
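
When scripting against the local server, it can help to wait until it is up before running evals. The sketch below polls langship server status. The helper name wait_for_server is hypothetical, and the sketch assumes that server status exits 0 only once the server is running, which this page does not confirm.

```shell
# Poll the local server until it reports ready, or give up.
# Usage: wait_for_server [MAX_ATTEMPTS] [DELAY_SECONDS]   (hypothetical helper)
wait_for_server() {
  max_attempts="${1:-15}"
  delay="${2:-2}"

  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Assumption: `server status` exits 0 only when the server is running.
    if langship server status > /dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done

  echo "server not ready after $max_attempts attempts; check: langship server logs" >&2
  return 1
}
```

A typical use would be langship server start followed by wait_for_server && langship eval run.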

Global flags

Flag             Description
--profile NAME   Use a named config profile
--verbose        Show verbose output
--json           Output raw JSON
--no-color       Disable colored output
--help           Show help for any command

Environment variables

Variable            Description
LANGSHIP_ENDPOINT   Server URL (overrides config)
LANGSHIP_API_KEY    API key (overrides config)
LANGSHIP_PROJECT    Default project (overrides config)
LANGSHIP_ENV        Default environment (overrides config)
LANGSHIP_NO_COLOR   Disable colored output
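
In CI, where no config file or browser login is available, these variables are the natural way to configure the CLI. The following sketch fails early when a required variable is missing. The helper name require_env is hypothetical, and it assumes that setting LANGSHIP_API_KEY is enough to authenticate without running langship login.

```shell
# Check that every named environment variable is set and non-empty.
# Usage: require_env NAME...   (hypothetical helper, not a langship command)
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For a self-hosted deployment, a job could start with require_env LANGSHIP_API_KEY LANGSHIP_ENDPOINT && langship eval run, so that a missing secret is reported before any command runs.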