Retrieve the complete list of supported providers and models for Speech-to-Text (STT), Large Language Models (LLM), and Text-to-Speech (TTS) pipelines.
While Realtime models (such as gpt-realtime or gemini-native-audio) handle listening, thinking, and speaking in one step, a Pipeline agent strings together three separate specialized models. This endpoint gives you the exact IDs needed to configure that three-step chain.
Use the x-api-key header to authenticate this request.

The three stages of a Pipeline agent are configured in the POST /agents request. The response from this endpoint provides the available options for each:
stt (Speech-to-Text): Check the languages array for each STT model to ensure it supports your target demographic.
llm (Large Language Model)
tts (Text-to-Speech): Note the defaultVoiceId for each TTS model, which you can use as a fallback if you aren't rendering a full list of custom voice clones.
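
The lookup described above could be sketched roughly as follows in Python. This is a minimal illustration under several assumptions, not the documented client. The URL in MODELS_URL is a placeholder, because the source does not state the endpoint path. The response is assumed to be a JSON object with top-level stt, llm, and tts lists whose entries each carry an id field. Only the x-api-key header, the languages array, and defaultVoiceId come from the text above. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import List, Optional

# Placeholder URL (assumption): the source does not give the host or path.
MODELS_URL = "https://api.example.com/pipeline/models"


def build_request(api_key: str, url: str = MODELS_URL) -> urllib.request.Request:
    """Build a GET request that authenticates with the x-api-key header."""
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def fetch_catalog(api_key: str, url: str = MODELS_URL) -> dict:
    """Fetch and decode the catalog. The response shape is assumed, see above."""
    with urllib.request.urlopen(build_request(api_key, url), timeout=10) as resp:
        return json.load(resp)


def stt_models_for_language(catalog: dict, language: str) -> List[str]:
    """Return the IDs of STT models whose languages array contains `language`."""
    return [
        model["id"]
        for model in catalog.get("stt", [])
        if language in model.get("languages", [])
    ]


def default_voice(catalog: dict, tts_model_id: str) -> Optional[str]:
    """Return the defaultVoiceId of the given TTS model, or None if not found."""
    for model in catalog.get("tts", []):
        if model.get("id") == tts_model_id:
            return model.get("defaultVoiceId")
    return None
```

In use, you would call fetch_catalog once with your API key, filter the STT entries by the language code your users speak (the code format, such as "es", is also an assumption), and pass the chosen IDs on to the POST /agents request. Keeping the selection helpers separate from the network call makes them easy to check against a saved response.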