A full-featured browser interface served at http://localhost:3333 — chat, skills, integrations, and voice all in one place.

Starting the Server

# No auth — open to anyone on the network
gitagent --voice --dir ~/assistant
# opens http://localhost:3333

# With password protection
GITAGENT_PASSWORD=mysecret gitagent --voice --dir ~/assistant

# Custom username (defaults to "admin")
GITAGENT_USERNAME=alice GITAGENT_PASSWORD=mysecret gitagent --voice --dir ~/assistant
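
Since the server is open to anyone on the network when no password is set, it may be convenient to generate a random password instead of typing one by hand. The following is a minimal sketch, assuming a shell with head, od, and tr available. gen_password is a hypothetical helper, not part of GitAgent; only GITAGENT_PASSWORD, --voice, and --dir come from the commands above.

```shell
# gen_password: hypothetical helper (not part of GitAgent).
# Reads 16 random bytes from /dev/urandom and prints them as 32 hex characters.
gen_password() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Example launch (commented out so the snippet can be sourced safely):
# pw="$(gen_password)"
# echo "login password: $pw"
# GITAGENT_PASSWORD="$pw" gitagent --voice --dir ~/assistant
```

Printing the password once before launch is a deliberate choice in this sketch, because the value is not stored anywhere else.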
Auth behaviour
  • Port is always 3333 — no env var to change it
  • All HTTP routes show a login page when GITAGENT_PASSWORD is set
  • WebSocket connections are rejected without a valid auth cookie
  • /health always stays open (for load balancers)
  • Cookie: HttpOnly, SameSite=Strict, 24-hour expiry, SHA-256 token
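
Because /health stays open even when a password is set, a startup script or load balancer probe can poll it without logging in. The sketch below shows one way to wait for the server with curl. wait_for_health is a hypothetical helper, not part of GitAgent, and it assumes that /health answers with a successful (2xx) HTTP status once the server is up; the response body is not inspected because its format is not documented here.

```shell
# wait_for_health: hypothetical helper (not part of GitAgent).
# Polls a URL about once per second until it returns a successful HTTP
# status or the attempt budget runs out.
# Assumption: /health answers with a 2xx status when the server is up.
wait_for_health() {
  wfh_url="${1:-http://localhost:3333/health}"
  wfh_attempts="${2:-10}"
  wfh_i=1
  while [ "$wfh_i" -le "$wfh_attempts" ]; do
    # -f: non-zero exit on HTTP errors; -s: quiet; --max-time: cap each try
    if curl -fs -o /dev/null --max-time 2 "$wfh_url"; then
      echo "healthy after $wfh_i attempt(s)"
      return 0
    fi
    wfh_i=$((wfh_i + 1))
    sleep 1
  done
  echo "not healthy after $wfh_attempts attempt(s)" >&2
  return 1
}

# Example (commented out so the snippet can be sourced safely):
# wait_for_health http://localhost:3333/health 30 || exit 1
```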

Interface Tabs

Tab            Description
Chat           Real-time conversation, voice controls, camera, file system viewer
Skills         Browse and install skills from the marketplace
Integrations   Connect Composio services (Gmail, Calendar, Slack, GitHub)
Communication  Telegram bot setup, WhatsApp connection, phone/SMS webhook
SkillFlows     Visual workflow builder: chain skills into multi-step flows
Scheduler      Create cron jobs: run prompts on a schedule
Settings       Model selection, API keys, custom base URL; saves to .env and agent.yaml

Voice Mode

OpenAI Realtime (default)

  • Model: gpt-realtime-2025-08-28
  • Real-time audio streaming over WebSocket
  • Supports image input (camera frames)
  • Requires: OPENAI_API_KEY
OPENAI_API_KEY=your_key gitagent --voice --dir ~/assistant

Gemini Live (free tier)

  • Model: models/gemini-2.5-flash-native-audio-preview
  • Alternative voice provider
  • Free tier available
  • Requires: GEMINI_API_KEY
GEMINI_API_KEY=your_key gitagent --voice gemini --dir ~/assistant

Camera Input

  • Front/back camera toggle (mobile)
  • Captures frames every 2 seconds as JPEG
  • Frames injected into conversation as images
  • Auto-captures on “memorable moments” (laughter, excitement)

Text-Only Fallback

Without a voice API key, GitAgent still starts the web UI server at http://localhost:3333, but with voice disabled. Text input routes directly to the agent via query().
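
The three cases in this section (OpenAI key, Gemini key, no key) can be folded into a small launcher. This is only a sketch: voice_args is a hypothetical helper, not part of GitAgent, and it assumes that plain --voice is also the right flag for the text-only fallback, which the text above implies but does not state outright. When both keys are set it prefers the default provider.

```shell
# voice_args: hypothetical helper (not part of GitAgent).
# Prints the voice flags that match whichever API key is set.
voice_args() {
  # Only a Gemini key is present: select the Gemini Live provider.
  if [ -z "${OPENAI_API_KEY:-}" ] && [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "--voice gemini"
    return 0
  fi
  # No key at all: warn on stderr (assumed text-only fallback).
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "note: no voice API key set; expecting text-only mode" >&2
  fi
  # OpenAI Realtime is the default provider.
  echo "--voice"
}

# Example launch (commented out so the snippet can be sourced safely).
# The substitution is left unquoted on purpose so "--voice gemini"
# splits into two arguments:
# gitagent $(voice_args) --dir ~/assistant
```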
