Simulation
Generate synthetic conversations with persona-based testing — stress-test your AI agents at scale before shipping to production.
Why Use Simulation?
Real user data is limited, slow to collect, and often lacks edge-case coverage. The Simulation page lets you generate diverse, controlled test conversations by defining personas and scenarios, then running them against your agents in batch.
Persona-Based Testing
Define user personas with traits, expertise levels, and communication styles to simulate realistic diversity.
Synthetic Conversations
Generate multi-turn conversations grounded in persona profiles and scenario templates.
Agent Integration
Connect directly to your AI agents and run automated simulation tests end-to-end.
Batch Evaluation
Run hundreds of simulations in parallel, automatically evaluate quality, and export results.
Quick Start
Follow these four steps to run your first simulation:
Define Personas
Create one or more user personas with attributes like name, background, expertise level, and communication style. Personas drive the tone and content of generated conversations.
Configure Scenarios
Set up test scenarios with topic templates, difficulty levels, and expected behaviors. Each scenario defines the conversation context your personas will interact with.
Run Simulation
Launch the simulation batch. Monitor progress in real time as conversations are generated across all persona-scenario combinations.
Review & Export
Inspect generated conversations, review quality metrics, and export the results as CSV for use in the Evaluate page or external analysis tools.
Page Anatomy
The Simulation page is organized into four workflow sections, accessible via a vertical stepper or tab bar: Persona Configuration, Scenario Setup, Running Simulations, and Results Review.
Persona Configuration
Personas define who is interacting with your AI agent. Each persona is a set of attributes that shape the generated conversation's tone, vocabulary, and complexity.
Persona Attributes
| Attribute | Description | Example Values |
|---|---|---|
| Name | Display name for identification | Jane Doe, Alex Smith |
| Age | Simulated age bracket | 25, 42, 68 |
| Background | Professional or personal context | Senior Engineer, New Customer |
| Expertise Level | Familiarity with the product domain | Beginner, Intermediate, Expert |
| Communication Style | How the persona phrases questions | Technical, Casual, Formal, Verbose, Concise |
Effective Persona Design
- Cover the spectrum — Include beginners, intermediates, and experts to test different response styles
- Vary communication styles — A terse, technical user exercises different code paths than a verbose, casual one
- Add adversarial personas — Create a persona that asks ambiguous or off-topic questions to test guardrails
- Match your user base — Model personas after real customer segments for relevant test coverage
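The persona attributes above can be captured in a simple data structure. A minimal sketch in Python, using a plain dataclass rather than any specific AXIS SDK type (the `Persona` class and its fields are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class Persona:
    """A simulated user profile that shapes generated conversations.
    Field names mirror the attribute table above; this is an illustrative
    structure, not an official AXIS type."""
    name: str
    age: int
    background: str
    expertise_level: str        # "Beginner", "Intermediate", or "Expert"
    communication_style: str    # e.g. "Technical", "Casual", "Verbose"

# Cover the spectrum: a beginner, an expert, and a verbose intermediate.
personas = [
    Persona("Jane Doe", 25, "New Customer", "Beginner", "Casual"),
    Persona("Alex Smith", 42, "Senior Engineer", "Expert", "Technical"),
    Persona("Pat Lee", 68, "Long-time Customer", "Intermediate", "Verbose"),
]
```

Keeping personas as plain data makes it easy to version them alongside your test suite and reuse the same set across simulation runs.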
Scenario Setup
Scenarios define what each persona will ask about. They provide the topic, context, difficulty, and expected agent behaviors.
Scenario Template Fields
| Field | Description | Example |
|---|---|---|
| Topic | The subject area of the conversation | Return policy, Account setup |
| Difficulty | Complexity level for the scenario | Easy, Medium, Hard |
| Context / Prompt | Additional instructions or constraints | "User is frustrated after 3 failed attempts" |
| Expected Behaviors | What the agent should do or avoid | Offer escalation, Avoid jargon |
| Max Turns | Conversation length limit | 5, 10, 20 |
Example scenarios:
- User asks about returning an electronics item purchased online within the last 30 days.
- Frustrated user disputes a charge after 3 failed support attempts. Requires empathy and escalation.
- New user needs help creating an account and connecting a payment method.
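A scenario template maps naturally onto the fields in the table above. A sketch, again assuming a plain dataclass rather than an official AXIS type, populated with two of the example scenarios:

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    """Illustrative scenario template; fields mirror the table above."""
    topic: str
    difficulty: str                      # "Easy", "Medium", or "Hard"
    context: str                         # extra instructions or constraints
    expected_behaviors: list = field(default_factory=list)
    max_turns: int = 10

scenarios = [
    Scenario(
        topic="Return Policy",
        difficulty="Easy",
        context="User asks about returning an electronics item purchased "
                "online within the last 30 days.",
        expected_behaviors=["Explain return window", "Avoid jargon"],
        max_turns=5,
    ),
    Scenario(
        topic="Billing Dispute",
        difficulty="Hard",
        context="Frustrated user disputes a charge after 3 failed support attempts.",
        expected_behaviors=["Show empathy", "Offer escalation"],
        max_turns=10,
    ),
]
```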
Running Simulations
Once personas and scenarios are configured, launch the simulation. The Run step shows real-time progress as conversations are generated.
Progress Monitoring
| Persona | Scenario | Turns | Status |
|---|---|---|---|
| Jane Doe | Return Policy | 5 | Complete |
| Jane Doe | Billing Dispute | 8 | Complete |
| Alex Smith | Return Policy | 4 | Complete |
| Alex Smith | Account Setup | 6 | Running |
| Pat Lee | Billing Dispute | — | Queued |
Key behaviors during a run:
- Parallel execution — Multiple conversations run concurrently for faster batch completion
- Live status — Each row updates in real time as conversations progress through turns
- Error handling — Failed conversations are marked in red and can be retried individually
- Cancel support — Stop the batch at any time; completed conversations are preserved
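The run behaviors above (parallel execution, per-row status, failures marked for retry) can be sketched with Python's standard `concurrent.futures`. The agent call here is a stand-in lambda, not a real AXIS API:

```python
import concurrent.futures

def simulate(persona, scenario, agent_fn):
    """Generate one conversation; returns a status dict for the progress table."""
    turns = []
    try:
        for i in range(scenario["max_turns"]):
            user_msg = f"[{persona['name']}] question {i + 1} about {scenario['topic']}"
            turns.append({"user": user_msg, "agent": agent_fn(user_msg)})
        return {"status": "Complete", "turns": turns}
    except Exception as exc:
        # Failed conversations keep their partial transcript and can be retried.
        return {"status": "Failed", "error": str(exc), "turns": turns}

def run_batch(personas, scenarios, agent_fn, workers=4):
    """Run every persona-scenario combination concurrently."""
    jobs = [(p, s) for p in personas for s in scenarios]
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(simulate, p, s, agent_fn) for p, s in jobs]
        return [f.result() for f in concurrent.futures.as_completed(futures)]

# Example with a stub agent that always answers "ok":
results = run_batch(
    [{"name": "Jane"}, {"name": "Alex"}],
    [{"topic": "Returns", "max_turns": 3}],
    agent_fn=lambda msg: "ok",
)
```

Cancelling mid-batch corresponds to shutting the pool down without waiting on remaining futures; already-completed results are preserved, mirroring the cancel behavior described above.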
Results Review
After the simulation completes, the Results step displays generated conversations alongside quality metrics.
Overview Metrics
A KPI strip at the top summarizes the run, and a results table below it lists each conversation with its quality score and any flagged issues:
| Persona | Scenario | Turns | Quality | Issues | Details |
|---|---|---|---|---|---|
| Jane Doe | Return Policy | 5 | 0.92 | — | View → |
| Jane Doe | Billing Dispute | 8 | 0.71 | Missed escalation | View → |
| Alex Smith | Return Policy | 4 | 0.89 | — | View → |
| Alex Smith | Account Setup | 6 | 0.87 | — | View → |
| Pat Lee | Billing Dispute | 10 | 0.52 | Tone mismatch | View → |
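Summary figures such as average quality and flagged-conversation counts can be derived directly from the per-conversation rows. A sketch using the scores from the table above:

```python
# Rows taken from the results table above.
rows = [
    {"persona": "Jane Doe",   "scenario": "Return Policy",   "quality": 0.92, "issue": None},
    {"persona": "Jane Doe",   "scenario": "Billing Dispute", "quality": 0.71, "issue": "Missed escalation"},
    {"persona": "Alex Smith", "scenario": "Return Policy",   "quality": 0.89, "issue": None},
    {"persona": "Alex Smith", "scenario": "Account Setup",   "quality": 0.87, "issue": None},
    {"persona": "Pat Lee",    "scenario": "Billing Dispute", "quality": 0.52, "issue": "Tone mismatch"},
]

avg_quality = sum(r["quality"] for r in rows) / len(rows)
flagged = [r for r in rows if r["issue"]]
print(f"avg quality: {avg_quality:.2f}, flagged: {len(flagged)}/{len(rows)}")
```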
Conversation Detail View
Click View on any row to open the full conversation transcript. The detail view shows:
- Message timeline — Alternating user (persona) and agent messages with timestamps
- Per-turn scores — Quality and relevance scores for each agent response
- Issue annotations — Flagged turns are highlighted with inline explanations
- Persona context — A sidebar reminds you of the persona's attributes and the scenario template
Export & Integration
Simulation results can be exported and fed into other AXIS pages for deeper analysis.
Export Options
| Format | Contents | Use Case |
|---|---|---|
| CSV | Flattened rows with persona, scenario, turns, scores, and full transcript | Upload to the Evaluate page for scoring with additional metrics |
| JSON | Structured conversation objects with metadata | Programmatic analysis, CI/CD pipeline integration |
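Exported files can be consumed programmatically with only the standard library. A sketch that filters a CSV export for low-quality conversations; the column names here are an assumption about the export schema, not a documented contract:

```python
import csv
import io

def low_quality_rows(csv_text, threshold=0.8):
    """Return rows from an exported simulation CSV whose quality score
    falls below the threshold (candidates for closer review)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row["quality"]) < threshold]

# Illustrative export snippet (column names are assumed):
export = """persona,scenario,turns,quality
Jane Doe,Return Policy,5,0.92
Pat Lee,Billing Dispute,10,0.52"""

review_queue = low_quality_rows(export)
```

The same filtering logic applies to the JSON export; a CI/CD pipeline could fail the build when `review_queue` is non-empty.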
Integration with Other Pages
- Evaluate — Export simulation CSV and upload it as an evaluation dataset. Run LLM-as-Judge or automated metrics on the generated conversations.
- Monitoring — Use simulation results as baseline data to compare against production conversation quality.
- Calibration — Feed simulation outputs into the Calibration Studio to test judge agreement on synthetic data before applying to production.