Voice Agent

🎙️ Introduction to Voice Agent in Lyzr

Lyzr’s Voice Agent feature allows you to build intelligent agents that can understand spoken queries and respond through high-quality, lifelike voice output — all without writing a single line of code.

These agents bring true conversational AI to your applications, combining speech interfaces with the reasoning power of large language models (LLMs). Whether you’re building for customer support, field operations, accessibility, or hands-free automation, voice is often the most intuitive interface.

🧠 What Powers the Voice Agent?

Lyzr integrates best-in-class services for both speech recognition and generation:

🗣️ Deepgram – Speech-to-Text (STT)

Converts spoken language into accurate, real-time text.
Handles natural variations, accents, and noise robustly.
Used to transcribe your voice input into query-ready text.

🔊 ElevenLabs – Text-to-Speech (TTS)

Converts the agent’s reply into human-like voice output.
Offers multiple voices with emotional tone and inflection.
Used to speak the agent’s reply back to the user.

Together, these create a seamless audio pipeline — from voice in → AI thinking → voice out.
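
The voice in → AI thinking → voice out flow can be pictured as three stages chained together. The Python sketch below is illustrative only: Lyzr configures this pipeline without code, and `VoicePipeline`, `handle_turn`, and the `fake_*` stand-in functions are hypothetical names, not part of the Lyzr, Deepgram, or ElevenLabs APIs. In a real integration each stand-in would be replaced by a call to the corresponding provider.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is modeled as a plain function so any provider could be plugged in.
SpeechToText = Callable[[bytes], str]   # role played by Deepgram
LanguageModel = Callable[[str], str]    # role played by the LLM (GPT/Claude)
TextToSpeech = Callable[[str], bytes]   # role played by ElevenLabs


@dataclass
class VoicePipeline:
    """Hypothetical pipeline: audio in -> text -> reply -> audio out."""
    transcribe: SpeechToText
    respond: LanguageModel
    synthesize: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        query = self.transcribe(audio_in)   # speech-to-text
        reply = self.respond(query)         # LLM reasoning
        return self.synthesize(reply)       # text-to-speech


# Stand-in stages for illustration; they treat UTF-8 bytes as "audio".
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(query: str) -> str:
    return f"You asked: {query}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_stt, fake_llm, fake_tts)
audio_out = pipeline.handle_turn(b"what are your opening hours?")
```

Keeping each stage behind a simple function signature mirrors the table of components in the Summary: any stage can be swapped without touching the other two.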

🔍 Why Use Voice Agents?

Voice is one of the most natural, fastest, and most accessible ways for people to communicate. With Lyzr Voice Agents, you unlock:

Hands-Free Interaction: Ideal for mobile, industrial, or kiosk-based environments.
Accessibility: Empower users who prefer or require voice over text.
Conversational Interfaces: Build assistants that feel alive and responsive.
Faster Workflows: Speak tasks instead of typing them.

🎯 Ideal Use Cases

Use Case	Description
AI Helpdesk Agent	Let users ask support questions by voice
AI Receptionist	Greet users, route requests, or capture basic details in spoken form
Operational Assistant	Workers in the field or factory can issue spoken commands to agents
Voice-Enabled Kiosk	Add conversational capabilities to physical spaces (museums, banks, etc.)

🔒 Voice Data Handling

Audio is processed securely via API calls to Deepgram and ElevenLabs.
No raw voice recordings are stored unless explicitly enabled.
All voice operations happen live — no persistent storage by default.
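
The "no storage unless explicitly enabled" policy above can be thought of as an opt-in flag that defaults to off. The sketch below is a hypothetical illustration of that idea, not Lyzr's implementation; `VoiceSession`, `store_recordings`, and `process` are assumed names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VoiceSession:
    # Hypothetical opt-in flag: recordings are kept only when this is True.
    store_recordings: bool = False
    saved_audio: List[bytes] = field(default_factory=list)

    def process(self, audio: bytes) -> int:
        """Handle one audio chunk live and return the number of bytes seen."""
        if self.store_recordings:
            self.saved_audio.append(audio)  # persisted only after explicit opt-in
        return len(audio)


default_session = VoiceSession()
default_session.process(b"hello")   # handled live, nothing retained

opted_in = VoiceSession(store_recordings=True)
opted_in.process(b"hello")          # retained because storage was enabled
```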

Summary

Component	Technology	Role
Voice Input	Microphone	Captures user query
STT Engine	Deepgram	Converts speech to text
LLM Engine	GPT/Claude	Understands and responds
TTS Engine	ElevenLabs	Converts response to speech
Voice Output	Audio Playback	Speaks back to the user

Lyzr Voice Agents enable you to build real-time conversational systems that are voice-first, human-like, and AI-powered. With Deepgram handling speech recognition and ElevenLabs handling speech generation, you can add a speech interface to an agent without writing code.
