GET /config/pipeline-options
List Pipeline Options
curl --request GET \
  --url https://voice-livekit.studio.lyzr.ai/v1/config/pipeline-options \
  --header 'x-api-key: <api-key>'
{
  "stt": [
    {
      "providerId": "deepgram",
      "displayName": "Deepgram",
      "models": [
        {
          "id": "deepgram/nova-3:en",
          "name": "Deepgram Nova-3 (EN)",
          "languages": [
            "<string>"
          ]
        }
      ]
    }
  ],
  "tts": [
    {
      "providerId": "elevenlabs",
      "displayName": "ElevenLabs",
      "models": [
        {
          "id": "elevenlabs/eleven_turbo_v2_5",
          "name": "ElevenLabs Eleven Turbo v2.5",
          "defaultVoiceId": "Xb7hH8MSUJpSbSDYk0k2",
          "languages": [
            "<string>"
          ]
        }
      ]
    }
  ],
  "llm": [
    {
      "providerId": "openai",
      "displayName": "OpenAI",
      "models": [
        {
          "id": "openai/gpt-4o",
          "name": "GPT-4o"
        }
      ]
    }
  ]
}
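The response is plain JSON, so it is straightforward to index by pipeline stage. A minimal sketch in Python, working over a trimmed copy of the example payload above (the `"en"` language code stands in for the `"<string>"` placeholder and is an assumption):

```python
# Trimmed copy of the example pipeline-options response above.
# "en" replaces the docs' "<string>" placeholder (assumed value).
options = {
    "stt": [{"providerId": "deepgram", "displayName": "Deepgram",
             "models": [{"id": "deepgram/nova-3:en",
                         "name": "Deepgram Nova-3 (EN)",
                         "languages": ["en"]}]}],
    "tts": [{"providerId": "elevenlabs", "displayName": "ElevenLabs",
             "models": [{"id": "elevenlabs/eleven_turbo_v2_5",
                         "name": "ElevenLabs Eleven Turbo v2.5",
                         "defaultVoiceId": "Xb7hH8MSUJpSbSDYk0k2",
                         "languages": ["en"]}]}],
    "llm": [{"providerId": "openai", "displayName": "OpenAI",
             "models": [{"id": "openai/gpt-4o", "name": "GPT-4o"}]}],
}

def model_ids(options: dict, stage: str) -> list[str]:
    """Flatten every model id offered for a pipeline stage (stt/tts/llm)."""
    return [m["id"] for provider in options[stage] for m in provider["models"]]

print(model_ids(options, "stt"))  # ['deepgram/nova-3:en']
```

The same `model_ids` helper works for any of the three stages, since `stt`, `tts`, and `llm` share the provider/models shape.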
Discover the complete catalog of models available for building a modular Voice Agent pipeline. While “Realtime” models (such as gpt-realtime or gemini-native-audio) handle listening, thinking, and speaking in a single step, a Pipeline agent chains together three separate specialized models. This endpoint gives you the exact IDs needed to configure that three-step chain.
Authentication Required: You must include your API key in the x-api-key header to authenticate this request.

Understanding the Pipeline Architecture

To create a pipeline agent, you must configure three components in your POST /agents request. The response from this endpoint provides the available options for each:

1. stt (Speech-to-Text)

This is the model that transcribes the user’s spoken audio into text.
  • Available Providers: Deepgram, AssemblyAI, Cartesia, Sarvam.
  • Key Detail: Pay close attention to the languages array for each STT model to ensure it supports your target demographic.
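Checking the languages array can be done with a small filter. A sketch under stated assumptions: the Sarvam model id and all language codes below are illustrative placeholders, not values confirmed by this catalog; the live response reports each model's actual `languages` array.

```python
# Illustrative STT provider entries; "sarvam/example-model" and the
# language codes are hypothetical placeholders for this sketch.
stt_options = [
    {"providerId": "deepgram",
     "models": [{"id": "deepgram/nova-3:en", "languages": ["en"]}]},
    {"providerId": "sarvam",
     "models": [{"id": "sarvam/example-model", "languages": ["hi", "en"]}]},
]

def stt_models_for(language: str, providers: list[dict]) -> list[str]:
    """Return ids of STT models whose languages array lists `language`."""
    return [m["id"]
            for p in providers
            for m in p["models"]
            if language in m.get("languages", [])]

print(stt_models_for("hi", stt_options))  # ['sarvam/example-model']
```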

2. llm (Large Language Model)

This is the “brain” of the agent. It takes the text from the STT model, processes it against your system prompt, and generates a text response.
  • Available Providers: OpenAI (GPT-4o, GPT-5 series), Google (Gemini Flash/Pro series), DeepSeek, MoonshotAI.

3. tts (Text-to-Speech)

This model takes the text generated by the LLM and synthesizes it into spoken audio for the user to hear.
  • Available Providers: ElevenLabs, Cartesia, Deepgram (Aura), Inworld, Rime, Sarvam.
  • Key Detail: The response includes a defaultVoiceId for each TTS model, which you can use as a fallback if you aren’t rendering a full list of custom voice clones.

Authorizations

x-api-key (string, header, required)

Response

200 - application/json

Pipeline options retrieved successfully.

stt (object[]): Supported Speech-to-Text (transcription) providers and models.

tts (object[]): Supported Text-to-Speech (voice generation) providers and models.

llm (object[]): Supported text-based Large Language Models used for reasoning.