http://localhost:3333 — chat, skills, integrations, and voice all in one place.
Starting the Server
- Port is always 3333 — no env var to change it
- All HTTP routes show a login page when GITAGENT_PASSWORD is set
- WebSocket connections are rejected without a valid auth cookie
- /health always stays open (for load balancers)
- Cookie: HttpOnly, SameSite=Strict, 24-hour expiry, SHA-256 token
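The rules above can be sketched as a small gate function plus a cookie builder. This is only an illustration under stated assumptions: the cookie name `gitagent_auth`, the `server_secret` value, and the way the SHA-256 token is derived are hypothetical, since this page only states that the token is SHA-256 and that the cookie is HttpOnly, SameSite=Strict, and valid for 24 hours.

```python
import hashlib
import hmac
from http.cookies import SimpleCookie
from typing import Optional

COOKIE_NAME = "gitagent_auth"      # hypothetical name, not from the docs
COOKIE_MAX_AGE = 24 * 60 * 60      # 24-hour expiry, in seconds


def make_token(password: str, server_secret: str) -> str:
    """Derive a SHA-256 hex token (the derivation scheme is an assumption)."""
    return hashlib.sha256(f"{server_secret}:{password}".encode("utf-8")).hexdigest()


def build_auth_cookie(password: str, server_secret: str) -> str:
    """Return a Set-Cookie header value with the attributes listed above."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = make_token(password, server_secret)
    morsel = cookie[COOKIE_NAME]
    morsel["httponly"] = True
    morsel["samesite"] = "Strict"
    morsel["max-age"] = COOKIE_MAX_AGE
    morsel["path"] = "/"
    return morsel.OutputString()


def is_request_allowed(path: str,
                       presented_token: Optional[str],
                       expected_token: Optional[str]) -> bool:
    """Decide whether a request may pass without seeing the login page.

    expected_token is None when GITAGENT_PASSWORD is not set.
    """
    if path == "/health":
        return True                 # /health stays open for load balancers
    if expected_token is None:
        return True                 # no password configured, nothing to check
    if presented_token is None:
        return False                # password set but no cookie presented
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(presented_token, expected_token)
```

The same check would apply to a WebSocket upgrade request: if the cookie is missing or does not match, the connection is rejected.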
Interface Tabs
| Tab | Description |
|---|---|
| Chat | Real-time conversation, voice controls, camera, file system viewer |
| Skills | Browse and install skills from the marketplace |
| Integrations | Connect Composio services (Gmail, Calendar, Slack, GitHub) |
| Communication | Telegram bot setup, WhatsApp connection, phone/SMS webhook |
| SkillFlows | Visual workflow builder — chain skills into multi-step flows |
| Scheduler | Create cron jobs — run prompts on a schedule |
| Settings | Model selection, API keys, custom base URL — saves to .env and agent.yaml |
Voice Mode
OpenAI Realtime (default)
- Model: gpt-realtime-2025-08-28
- Real-time audio streaming over WebSocket
- Supports image input (camera frames)
- Requires: OPENAI_API_KEY
Gemini Live (free tier)
- Model: models/gemini-2.5-flash-native-audio-preview
- Alternative voice provider
- Free tier available
- Requires: GEMINI_API_KEY
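As a rough sketch of how a voice provider might be chosen from the environment: the function name, the returned dictionary shape, and the rule that OpenAI wins when both keys are present are assumptions (the page only marks OpenAI Realtime as the default). When neither key is set, the function returns `None`, which corresponds to text-only mode.

```python
import os
from typing import Mapping, Optional

OPENAI_MODEL = "gpt-realtime-2025-08-28"
GEMINI_MODEL = "models/gemini-2.5-flash-native-audio-preview"


def pick_voice_provider(env: Mapping[str, str]) -> Optional[dict]:
    """Pick a voice provider from API keys; None means voice is disabled."""
    if env.get("OPENAI_API_KEY"):
        # Assumed precedence: OpenAI Realtime is listed as the default.
        return {"provider": "openai-realtime", "model": OPENAI_MODEL}
    if env.get("GEMINI_API_KEY"):
        return {"provider": "gemini-live", "model": GEMINI_MODEL}
    return None


if __name__ == "__main__":
    choice = pick_voice_provider(os.environ)
    print(choice if choice else "voice disabled, text-only mode")
```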
Camera Input
- Front/back camera toggle (mobile)
- Captures frames every 2 seconds as JPEG
- Frames injected into conversation as images
- Auto-captures on “memorable moments” (laughter, excitement)
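The periodic capture and injection steps above could look roughly like this. It is a minimal sketch, not GitAgent's actual code: the helper names and the shape of the image message are hypothetical, and the "memorable moments" trigger is left out because the page does not say how it is detected.

```python
import base64
from typing import Optional

CAPTURE_INTERVAL_SECONDS = 2.0  # frames are captured every 2 seconds


def should_capture(last_capture: Optional[float], now: float,
                   interval: float = CAPTURE_INTERVAL_SECONDS) -> bool:
    """True when no frame was captured yet or the interval has elapsed."""
    return last_capture is None or (now - last_capture) >= interval


def frame_to_image_part(jpeg_bytes: bytes) -> dict:
    """Wrap a JPEG frame as an image entry for the conversation.

    The dictionary layout is an assumption for illustration only.
    """
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "type": "image",
        "mime_type": "image/jpeg",
        "data_url": f"data:image/jpeg;base64,{encoded}",
    }
```

Passing timestamps in explicitly (rather than calling a clock inside `should_capture`) keeps the timing rule easy to test.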
Text-Only Fallback
No voice API key? GitAgent still starts the web UI server but with voice disabled. Text input routes directly to the agent via query(). Web UI runs at http://localhost:3333.
Personal Assistant Quick Start
Install GitAgent and run your first session
CLI Reference
All flags, REPL commands, and the plugin CLI
Messaging
Connect Telegram, WhatsApp, and phone
Integrations
Connect Composio services like Gmail, Slack, and GitHub