# Simulation Engine

> Test your agent against synthetic conversations before deploying to real users.

The Simulation Engine generates realistic test inputs based on your agent's role and goal, runs them against the live agent, and scores the results across quality and safety metrics.

## How it works

1. Open an agent and go to **Safety and Evaluations > Simulation Engine**.
2. Define or auto-generate **scenarios**: situations the agent should handle. Examples: "angry customer demanding a refund," "user asking an out-of-scope question."
3. Define or auto-generate **personas**: user types the agent will encounter. Examples: "non-technical user," "enterprise decision-maker," "hostile adversarial user."
4. The engine combines scenarios and personas into test cases automatically.
5. Run the simulation. The engine executes each test case and scores the results.
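
To illustrate step 4, here is a minimal sketch of how scenarios and personas could be paired into test cases. It assumes a full cross product (every scenario with every persona); the engine's actual combination strategy is not described here, and `build_test_cases` and the dictionary keys are hypothetical names used only for illustration.

```python
from itertools import product

# Example values taken from the steps above.
scenarios = [
    "angry customer demanding a refund",
    "user asking an out-of-scope question",
]
personas = [
    "non-technical user",
    "enterprise decision-maker",
    "hostile adversarial user",
]


def build_test_cases(scenarios, personas):
    """Pair every scenario with every persona (assumed full cross product)."""
    return [
        {"scenario": scenario, "persona": persona}
        for scenario, persona in product(scenarios, personas)
    ]


test_cases = build_test_cases(scenarios, personas)
print(len(test_cases))  # 2 scenarios x 3 personas = 6 test cases
```

Under this assumption the number of test cases grows multiplicatively, so a small set of well-chosen scenarios and personas can already give broad coverage.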

## Scoring metrics

| Metric          | What it measures                                              |
| --------------- | ------------------------------------------------------------- |
| Task Completion | Did the agent accomplish what the user asked?                 |
| Hallucination   | Did the agent fabricate facts not present in its knowledge?   |
| Faithfulness    | Is the response grounded in the connected Knowledge Base?     |
| Toxicity        | Did the agent produce harmful content?                        |
| Bias            | Did the agent treat any group unfairly?                       |
| Tool Accuracy   | Did the agent call the right tool with the correct arguments? |
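
As a rough illustration of what the Tool Accuracy metric checks, the sketch below compares an expected tool call with the one the agent actually made. It assumes strict equality on both the tool name and the arguments; the engine's real scoring logic may differ, and the `tool_call_matches` helper, the dictionary layout, and the `issue_refund` tool are hypothetical.

```python
def tool_call_matches(expected, actual):
    """Return True when the tool name and all arguments are identical."""
    return (
        expected["name"] == actual["name"]
        and expected["arguments"] == actual["arguments"]
    )


# Hypothetical tool call used only for illustration.
expected = {"name": "issue_refund", "arguments": {"order_id": "A-1001", "amount": 25.0}}

right_call = {"name": "issue_refund", "arguments": {"order_id": "A-1001", "amount": 25.0}}
wrong_args = {"name": "issue_refund", "arguments": {"order_id": "A-1001", "amount": 250.0}}

print(tool_call_matches(expected, right_call))  # True
print(tool_call_matches(expected, wrong_args))  # False
```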

## Agent Hardening

When test cases fail, select them and choose **Agent Hardening**. The engine analyzes the failure patterns and recommends changes to the agent's instructions, model selection, or feature configuration (for example, enabling Reflection for an agent that is hallucinating).

Review the recommendations, apply them to the agent, and re-run the simulation to confirm improvement.

## Before going to production

Run the Simulation Engine until the agent meets your quality bar. A reasonable threshold for most production agents is 90% or higher task completion, zero toxicity failures, a hallucination rate below your acceptable limit, and correct output from every tool call.
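
If you track results outside the product, the thresholds above can be expressed as a simple check. This is a sketch under stated assumptions: `SimulationSummary` and its fields are hypothetical and do not reflect the Simulation Engine's export format, and the hallucination limit is a value you choose yourself.

```python
from dataclasses import dataclass


@dataclass
class SimulationSummary:
    # Hypothetical field names, not the Simulation Engine's export format.
    total_cases: int
    task_completed: int
    toxicity_failures: int
    hallucination_failures: int
    tool_calls: int
    correct_tool_calls: int


def gate_failures(summary, max_hallucination_rate, min_task_completion=0.90):
    """Return the reasons a run misses the bar; an empty list means it passes."""
    if summary.total_cases == 0:
        return ["no test cases were run"]

    reasons = []
    completion = summary.task_completed / summary.total_cases
    if completion < min_task_completion:
        reasons.append(f"task completion {completion:.0%} is below {min_task_completion:.0%}")
    if summary.toxicity_failures > 0:
        reasons.append(f"{summary.toxicity_failures} toxicity failure(s)")
    hallucination_rate = summary.hallucination_failures / summary.total_cases
    # The rate must be strictly below the limit to pass.
    if hallucination_rate >= max_hallucination_rate:
        reasons.append(f"hallucination rate {hallucination_rate:.0%} is not below the limit")
    if summary.correct_tool_calls < summary.tool_calls:
        reasons.append("not every tool call was correct")
    return reasons


summary = SimulationSummary(
    total_cases=50,
    task_completed=46,
    toxicity_failures=0,
    hallucination_failures=1,
    tool_calls=30,
    correct_tool_calls=30,
)
print(gate_failures(summary, max_hallucination_rate=0.05))  # []
```

Returning the list of reasons rather than a single boolean makes it easier to see which metric to address before re-running the simulation.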

The Simulation Engine is the primary quality gate before promoting any agent to a production environment.

## Next steps

* [Approval Flows](../governance/approval-flows)
* [Tracing](tracing)
