1. Enabling Image as Output
To enable this feature, navigate to the Core Features section while creating or editing your Agent and toggle the switch for Image as Output.
- Go to Core Features: Locate the Image as Output toggle switch.
- Enable the Feature: Switch the toggle ON.
2. Configuring the Image Provider
Once enabled, you must configure the specific image generation model your Agent will use. Click the Configure link (as shown in the image above) to open the configuration panel.
Select Image Provider
The platform allows you to choose from various integrated image generation services.
Google Image Models
If you select Google, you will be presented with the available Gemini image models, such as:
- gemini-3-pro-image-preview
- gemini-2.5-flash-image
OpenAI Image Models
If you select OpenAI, you will have access to their image generation models, such as:
- gpt-image-1
- dall-e-3
- dall-e-2
After selecting your preferred provider and model, click the Save button.
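The provider and model choice described above can be pictured as a simple lookup. The following Python sketch is illustrative only: `SUPPORTED_IMAGE_MODELS` and `validate_image_selection` are hypothetical names, not part of the platform, and the catalogue contains only the model names listed in this guide.

```python
# Hypothetical catalogue: provider -> image models named in this guide.
# The real platform may expose more (or different) models.
SUPPORTED_IMAGE_MODELS = {
    "google": ["gemini-3-pro-image-preview", "gemini-2.5-flash-image"],
    "openai": ["gpt-image-1", "dall-e-3", "dall-e-2"],
}


def validate_image_selection(provider: str, model: str) -> bool:
    """Return True if the model is listed under the given provider."""
    models = SUPPORTED_IMAGE_MODELS.get(provider.strip().lower())
    return models is not None and model in models


if __name__ == "__main__":
    print(validate_image_selection("OpenAI", "dall-e-3"))   # valid pairing
    print(validate_image_selection("Google", "dall-e-3"))   # model belongs to another provider
```

A check like this mirrors what the configuration panel enforces through its drop-downs: a model can only be chosen after its provider has been selected.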
3. Agent Configuration Example
The screenshots illustrate a “Dream House Image Generator” Agent configured for image output:

| Field | Configuration | Purpose |
|---|---|---|
| Name | Dream House Image Generator | Descriptive Agent name. |
| Model | OpenAI / gpt-4t | The large language model driving the Agent’s reasoning. |
| Agent Role | Expert Architectural Visualization Assistant… | Defines the Agent’s persona and expertise. |
| Agent Goal | Produce accurate, high-quality images… | Clear objective for the Agent’s output. |
| Agent Instructions | Overview: You generate house images on demand… | Detailed instructions on expected output (front view, interior, landscaping), format (PNG/JPEG), and supporting variations. |
| Core Feature | Image as Output (Enabled) | Activates the image generation capability. |
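
For readers who prefer to see the configuration as data, the table can be sketched as a small Python structure. This is an assumption for illustration: the `AgentConfig` class, its field names, and the `empty_fields` helper are hypothetical and do not reflect the platform's actual API or export format. Values truncated with "…" in the table are left truncated here.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AgentConfig:
    """Hypothetical mirror of the fields in the configuration table."""
    name: str
    model: str
    role: str
    goal: str
    instructions: str
    image_as_output: bool = False  # the Core Feature toggle


def empty_fields(config: AgentConfig) -> List[str]:
    """List the text fields that were left blank."""
    return [
        f.name
        for f in fields(config)
        if isinstance(getattr(config, f.name), str)
        and not getattr(config, f.name).strip()
    ]


# Values copied from the table; elided text stays elided.
dream_house = AgentConfig(
    name="Dream House Image Generator",
    model="OpenAI / gpt-4t",
    role="Expert Architectural Visualization Assistant…",
    goal="Produce accurate, high-quality images…",
    instructions="Overview: You generate house images on demand…",
    image_as_output=True,
)

if __name__ == "__main__":
    print(empty_fields(dream_house))
```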
4. Testing the Agent Inference
Once configured, you can test the Agent’s ability to generate images. The Agent processes the user’s text prompt and uses the selected image provider to generate the visual output. In the example below, the user requests a specific type of house:

“I want the front view of a modern french style palace. I want it to be in shaded of black and wooden color. It needs to be photo realistic one.”

The Agent responds by:
- Generating 1 JPEG image.
- Providing a detailed Prompt used for the image generation (which is based on the user’s input and refined by the Agent’s instructions).
- Offering a link to the generated image.
- Suggesting follow-up actions to the user (e.g., interior views).
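
The four parts of the response listed above can be modelled as one record. The sketch below is hypothetical: the guide does not document the platform's response schema, so `AgentImageResponse`, its field names, and the placeholder prompt and URL are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentImageResponse:
    """Hypothetical shape of an image response; not the platform's schema."""
    image_count: int
    image_format: str          # e.g. "JPEG" or "PNG"
    prompt: str                # refined prompt sent to the image provider
    image_url: str             # link to the generated image
    follow_ups: List[str] = field(default_factory=list)


def summarize(response: AgentImageResponse) -> str:
    """Render the response parts in the order the guide lists them."""
    lines = [
        f"Generated {response.image_count} {response.image_format} image(s).",
        f"Prompt: {response.prompt}",
        f"Link: {response.image_url}",
    ]
    lines += [f"Follow-up: {suggestion}" for suggestion in response.follow_ups]
    return "\n".join(lines)


# Placeholder values; the prompt text and URL are invented for the example.
example = AgentImageResponse(
    image_count=1,
    image_format="JPEG",
    prompt="Photorealistic front view of a modern French-style palace",
    image_url="https://example.com/generated/palace.jpg",
    follow_ups=["Generate interior views"],
)

if __name__ == "__main__":
    print(summarize(example))
```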
Example Output
The resulting image generated by the Agent based on the user’s prompt is a photorealistic front view of a black and white modern French-style palace.
With this feature, developers can build Agents that return custom generated images directly in their responses.