# Agent Simulation Engine

> Test, evaluate, and harden AI agents through automated simulation and reinforcement learning loops, available as a standalone SDK and API.

The Agent Simulation Engine (A-Sim) lets you test and improve AI agents before they reach production. It generates synthetic conversations from persona and scenario combinations, evaluates agent responses against metrics such as task completion, hallucinations, and answer relevancy, and rewrites agent instructions based on the failures it finds.

Use A-Sim when you need confidence that your agent handles real-world edge cases, adversarial users, and domain-specific compliance requirements before deployment.

## The world model

A-Sim structures testing around a world model made up of two dimensions: **personas** and **scenarios**.

A **persona** is a user archetype that defines who is interacting with the agent. Examples include a first-time user unfamiliar with the product, an experienced power user with technical knowledge, or an adversarial user trying to bypass the agent's guardrails.

A **scenario** is a task type that defines what the user is trying to accomplish. Examples include a basic policy inquiry, a complex compliance issue, or a request the agent is supposed to refuse.

A-Sim combines every persona with every scenario to produce a set of **simulations**, which are synthetic test conversations. For example, 3 personas and 3 scenarios yield 9 simulations. This cross-product approach tests the agent against every persona-scenario pair you define rather than a hand-picked subset.
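To make the cross-product concrete, here is a minimal sketch in plain Python. It does not use the A-Sim SDK; the persona and scenario labels and the `build_simulations` helper are hypothetical names chosen for illustration, based on the examples listed above.

```python
from itertools import product

# Hypothetical labels, modeled on the example personas and scenarios above.
personas = ["first_time_user", "power_user", "adversarial_user"]
scenarios = ["basic_policy_inquiry", "complex_compliance_issue", "request_to_refuse"]


def build_simulations(personas, scenarios):
    """Pair every persona with every scenario (Cartesian product)."""
    return [
        {"persona": persona, "scenario": scenario}
        for persona, scenario in product(personas, scenarios)
    ]


simulations = build_simulations(personas, scenarios)
print(len(simulations))  # 3 personas x 3 scenarios = 9
```

The number of simulations grows multiplicatively, so adding one persona adds one new simulation per existing scenario.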

## How it works

1. You create an **environment**, which is an isolated clone of your agent used for safe evaluation without affecting the production version.
2. A-Sim generates personas and scenarios automatically using the agent's role and goal, or you define them manually.
3. A-Sim combines personas and scenarios into simulations and runs each one against the agent.
4. An **evaluation** scores each simulation response across the metrics you select.
5. Simulations that fail are passed to **agent hardening**, which analyzes the failure patterns and produces an improved set of agent instructions.
6. You start a new evaluation round with the improved instructions and repeat the cycle until all simulations pass or you reach the maximum number of rounds you configure.
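A single round of the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the A-Sim SDK: `run_round`, `respond`, `evaluate`, and the dictionary-shaped agent are all hypothetical stand-ins, and personas and scenarios are defined manually rather than generated.

```python
import copy
from itertools import product


def run_round(agent, personas, scenarios, respond, evaluate):
    """Run one evaluation round and return the simulations that failed."""
    environment = copy.deepcopy(agent)  # step 1: isolated clone of the agent
    failures = []
    for persona, scenario in product(personas, scenarios):  # step 3
        reply = respond(environment, persona, scenario)
        if evaluate(scenario, reply) == "FAIL":  # step 4
            failures.append((persona, scenario))
    return failures  # step 5 would hand these to agent hardening


# Toy stand-ins (assumptions) so the sketch runs without any external service.
def respond(environment, persona, scenario):
    if scenario == "request_to_refuse" and "refuse" in environment["instructions"]:
        return "I can't help with that."
    return f"Sure, here is help with {scenario}."


def evaluate(scenario, reply):
    should_refuse = scenario == "request_to_refuse"
    refused = reply.startswith("I can't")
    return "PASS" if should_refuse == refused else "FAIL"


agent = {"instructions": "Answer policy questions."}
personas = ["first_time_user", "adversarial_user"]
scenarios = ["basic_policy_inquiry", "request_to_refuse"]

failures = run_round(agent, personas, scenarios, respond, evaluate)
print(failures)
```

In this toy setup the agent has no refusal instruction, so both personas fail the `request_to_refuse` scenario. Because the round runs against a deep copy, the original `agent` object is left unchanged.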

## Evaluation metrics

Each evaluation run scores responses against one or more of the following metrics. Each simulation receives a final judgment of `PASS` or `FAIL`.

| Metric             | What it measures                                                  |
| ------------------ | ----------------------------------------------------------------- |
| `task_completion`  | Whether the agent accomplished what the user asked                |
| `hallucinations`   | Whether the agent fabricated facts not present in its knowledge   |
| `answer_relevancy` | Whether the response is on-topic and directly addresses the query |
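The page does not specify how per-metric results roll up into the final judgment. The sketch below assumes one plausible rule: a simulation passes only if every selected metric passes. The `final_judgment` function and the boolean result format are hypothetical; only the three metric names come from the table above.

```python
# Metric names from the table above.
ALLOWED_METRICS = {"task_completion", "hallucinations", "answer_relevancy"}


def final_judgment(metric_results, selected_metrics):
    """Return "PASS" only if every selected metric passed (assumed rule)."""
    if not selected_metrics:
        raise ValueError("Select at least one metric.")
    unknown = set(selected_metrics) - ALLOWED_METRICS
    if unknown:
        raise ValueError(f"Unknown metrics: {sorted(unknown)}")
    all_passed = all(metric_results[name] for name in selected_metrics)
    return "PASS" if all_passed else "FAIL"


# Hypothetical per-metric outcomes for one simulation (True means passed).
results = {"task_completion": True, "hallucinations": False, "answer_relevancy": True}

print(final_judgment(results, ["task_completion", "answer_relevancy"]))
print(final_judgment(results, ["task_completion", "hallucinations"]))
```

Under this assumed rule, the same response can pass or fail depending on which metrics you select for the run.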

## Agent hardening

When simulations fail, A-Sim analyzes the failure patterns across the evaluation round and produces two agent configurations: the original and an improved version with rewritten instructions targeting the specific failures. You can review the changes before applying them, or let A-Sim apply and re-evaluate automatically.

The hardening loop continues round by round until all simulations pass or you reach the maximum number of rounds you configure.
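The stopping rule can be sketched as a small loop. This is a simplified model, not the A-Sim implementation: `hardening_loop`, `evaluate_round`, and `harden` are hypothetical names, and instructions are modeled as a list of rule labels rather than rewritten natural-language text.

```python
def hardening_loop(instructions, evaluate_round, harden, max_rounds):
    """Evaluate, then harden and re-evaluate until no failures or max_rounds."""
    failures = evaluate_round(instructions)
    rounds = 1
    while failures and rounds < max_rounds:
        instructions = harden(instructions, failures)
        failures = evaluate_round(instructions)
        rounds += 1
    return instructions, rounds, not failures


# Toy stand-ins (assumptions): a round fails for every required rule that is missing.
REQUIRED_RULES = {"cite_policy", "refuse_out_of_scope"}


def evaluate_round(instructions):
    return sorted(REQUIRED_RULES - set(instructions))


def harden(instructions, failures):
    # Address one failure pattern per round to show a multi-round run.
    return instructions + [failures[0]]


final, rounds, passed = hardening_loop([], evaluate_round, harden, max_rounds=5)
print(final, rounds, passed)
```

With `max_rounds=5` the toy run converges in three evaluation rounds. With `max_rounds=2` it stops early and reports that failures remain, which mirrors the second exit condition described above.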
