# Web & Voice

> Run GitAgent's full browser interface with real-time voice, camera input, and a text-only fallback.

A full-featured browser interface served at `http://localhost:3333`: chat, skills, integrations, and voice all in one place.

## Starting the Server

```bash
# No auth: open to anyone on the network
gitagent --voice --dir ~/assistant
# opens http://localhost:3333

# With password protection
GITAGENT_PASSWORD=mysecret gitagent --voice --dir ~/assistant

# Custom username (defaults to "admin")
GITAGENT_USERNAME=alice GITAGENT_PASSWORD=mysecret gitagent --voice --dir ~/assistant
```

**Auth behaviour**

* Port is always 3333; there is no env var to change it
* All HTTP routes show a login page when `GITAGENT_PASSWORD` is set
* WebSocket connections are rejected without a valid auth cookie
* `/health` always stays open (for load balancers)
* Cookie: HttpOnly, SameSite=Strict, 24-hour expiry, SHA-256 token

## Interface Tabs

| Tab           | Description                                                                  |
| ------------- | ---------------------------------------------------------------------------- |
| Chat          | Real-time conversation, voice controls, camera, file system viewer           |
| Skills        | Browse and install skills from the marketplace                               |
| Integrations  | Connect Composio services (Gmail, Calendar, Slack, GitHub)                   |
| Communication | Telegram bot setup, WhatsApp connection, phone/SMS webhook                   |
| SkillFlows    | Visual workflow builder: chain skills into multi-step flows                  |
| Scheduler     | Create cron jobs that run prompts on a schedule                              |
| Settings      | Model selection, API keys, custom base URL; saves to `.env` and `agent.yaml` |

## Voice Mode

### OpenAI Realtime (default)

* Model: `gpt-realtime-2025-08-28`
* Real-time audio streaming over WebSocket
* Supports image input (camera frames)
* Requires: `OPENAI_API_KEY`

```bash
OPENAI_API_KEY=your_key gitagent --voice --dir ~/assistant
```

### Gemini Live (free tier)

* Model: `models/gemini-2.5-flash-native-audio-preview`
* Alternative voice provider
* Free tier available
* Requires: `GEMINI_API_KEY`

```bash
GEMINI_API_KEY=your_key gitagent --voice gemini --dir ~/assistant
```

## Camera Input

* Front/back camera toggle (mobile)
* Captures frames every 2 seconds as JPEG
* Frames injected into conversation as images
* Auto-captures on "memorable moments" (laughter, excitement)

## Text-Only Fallback

No voice API key? GitAgent still starts the web UI server, but with voice disabled. Text input routes directly to the agent via `query()`. The web UI runs at `http://localhost:3333`.
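
The provider choice described under Voice Mode and Text-Only Fallback can be sketched as a small wrapper. This is an illustrative sketch, not part of GitAgent: the function names `pick_voice_provider` and `launch_command` are hypothetical, and preferring OpenAI when both keys are set is an assumption that simply mirrors OpenAI Realtime being the default. It also assumes the text-only fallback is reached with the same `--voice` invocation. The wrapper only prints the command, so it can be inspected before anything is started.

```bash
#!/usr/bin/env bash

# Hypothetical helper: report which voice provider the environment is
# configured for, based on the two API keys named on this page.
pick_voice_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"   # OpenAI Realtime is the default provider
  elif [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "gemini"   # Gemini Live needs the explicit "gemini" argument
  else
    echo "none"     # no key: the web UI starts with voice disabled
  fi
}

# Hypothetical helper: print (do not run) the launch command for an agent directory.
launch_command() {
  local dir="$1"
  case "$(pick_voice_provider)" in
    gemini) printf 'gitagent --voice gemini --dir %s\n' "$dir" ;;
    *)      printf 'gitagent --voice --dir %s\n' "$dir" ;;
  esac
}
```

For example, calling `launch_command "$HOME/assistant"` with only `GEMINI_API_KEY` set prints the `--voice gemini` form, while the other two cases print the plain `--voice` form.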