The Image as Output feature allows your custom Agent to generate images directly in response to a user’s prompt. This is especially powerful for Agents specializing in visual content, such as architectural visualization, graphic design, or creative art generation.

1. Enabling Image as Output

To enable this feature, open the Core Features section while creating or editing your Agent:
  • Go to Core Features: Locate the Image as Output toggle switch.
  • Enable the Feature: Switch the toggle ON.

2. Configuring the Image Provider

Once enabled, you must configure the specific image generation model your Agent will use. Click the Configure link (as shown in the image above) to open the configuration panel.

Select Image Provider

The platform allows you to choose from various integrated image generation services.

Google Image Models

If you select Google, you will be presented with the available Gemini image models, such as:
  • gemini-3-pro-image-preview
  • gemini-2.5-flash-image

OpenAI Image Models

If you select OpenAI, you will have access to OpenAI’s image models, including the DALL-E family, such as:
  • gpt-image-1
  • dall-e-3
  • dall-e-2
After selecting your preferred provider and model, click the Save button.
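
The provider and model choice described above can be pictured as a small lookup plus a validation step. The following is a minimal sketch only: the platform is configured through its UI, and the names `IMAGE_MODELS`, `validate_image_config`, and the returned dictionary keys are hypothetical, introduced here for illustration. Only the provider and model names come from the lists above.

```python
# Hypothetical catalogue mirroring the providers and models listed above.
# The real setting is made in the platform UI; this structure is illustrative.
IMAGE_MODELS = {
    "Google": ["gemini-3-pro-image-preview", "gemini-2.5-flash-image"],
    "OpenAI": ["gpt-image-1", "dall-e-3", "dall-e-2"],
}


def validate_image_config(provider: str, model: str) -> dict:
    """Return a config dict if the model belongs to the provider, else raise."""
    if provider not in IMAGE_MODELS:
        raise ValueError(f"Unknown image provider: {provider!r}")
    if model not in IMAGE_MODELS[provider]:
        raise ValueError(f"Model {model!r} is not offered by {provider}")
    # Assumed shape of a saved configuration (not the platform's actual schema).
    return {"image_as_output": True, "provider": provider, "model": model}


config = validate_image_config("OpenAI", "gpt-image-1")
print(config)
```

Pairing a Google model with the OpenAI provider (or the reverse) raises a `ValueError` in this sketch, which matches the way the configuration panel only offers models belonging to the selected provider.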

3. Agent Configuration Example

The screenshots illustrate a “Dream House Image Generator” Agent configured for image output:
| Field | Configuration | Purpose |
| --- | --- | --- |
| Name | Dream House Image Generator | Descriptive Agent name. |
| Model | OpenAI / gpt-4t | The large language model driving the Agent’s reasoning. |
| Agent Role | Expert Architectural Visualization Assistant… | Defines the Agent’s persona and expertise. |
| Agent Goal | Produce accurate, high-quality images… | Clear objective for the Agent’s output. |
| Agent Instructions | Overview: You generate house images on demand… | Detailed instructions on expected output (front view, interior, landscaping), format (PNG/JPEG), and supporting variations. |
| Core Feature | Image as Output (Enabled) | Activates the image generation capability. |

4. Testing the Agent Inference

Once configured, you can test the Agent’s ability to generate images. The Agent processes the user’s text prompt and uses the selected image provider to generate the visual output. In the example below, the user requests a specific type of house:
“I want the front view of a modern french style palace. I want it to be in shaded of black and wooden color. It needs to be photo realistic one.”
The Agent responds by:
  • Generating 1 JPEG image.
  • Providing the detailed prompt used for image generation (based on the user’s input and refined by the Agent’s instructions).
  • Offering a link to the generated image.
  • Suggesting follow-up actions to the user (e.g., interior views).
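
The four parts of the reply listed above can be modelled as a simple record. This is a rough sketch under stated assumptions: `ImageResponse`, its field names, and `summarize` are hypothetical and do not reflect the platform's actual response schema, and the URL and prompt text are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ImageResponse:
    """Hypothetical shape of an image-output reply; not the platform's schema."""
    image_url: str                  # link to the generated image
    image_format: str               # e.g. "JPEG"
    prompt_used: str                # refined prompt sent to the image provider
    follow_ups: list[str] = field(default_factory=list)  # suggested next steps


def summarize(resp: ImageResponse) -> str:
    """Render the reply as plain text, one line per element."""
    lines = [
        f"Generated 1 {resp.image_format} image: {resp.image_url}",
        f"Prompt used: {resp.prompt_used}",
    ]
    lines += [f"Suggested next step: {step}" for step in resp.follow_ups]
    return "\n".join(lines)


# Placeholder values for illustration only.
reply = ImageResponse(
    image_url="https://example.com/generated/house.jpg",
    image_format="JPEG",
    prompt_used="Photorealistic front view of a modern French-style palace",
    follow_ups=["Generate an interior view"],
)
print(summarize(reply))
```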

Example Output

The resulting image generated by the Agent based on the user’s prompt is a photorealistic front view of a black and white modern French-style palace.

This feature empowers developers to build Agents capable of delivering complex, custom visual content directly.