AXIS

Agent X-Ray Interface & Statistics

AXIS gives AI teams full visibility into model quality — from evaluation to production monitoring — so they can ship better models faster. Built as the visualization layer for the AXION evaluation engine, AXIS turns raw evaluation data into actionable insights through interactive dashboards, human-in-the-loop workflows, and real-time observability.


Why AXIS?

Comprehensive

11 integrated modules covering the full AI evaluation lifecycle — from batch evaluation and scoring to production monitoring, calibration, annotation, and decision memory.

Configurable

YAML-driven theming, agent registries, and data source configuration. Swap databases, customize branding, and extend functionality without touching code.

Open & Extensible

Self-hosted, API-first architecture. FastAPI backend with auto-generated OpenAPI docs, Zustand stores for clean state management, and a modular component library.


Platform Overview

Evaluate

Upload evaluation data, run batch evaluations, and explore results through interactive tree visualizations, analytics dashboards with 8+ chart types, and AI-generated reports with structured cross-metric pattern insights.

Production

Executive overview combining Agent KPIs, AI quality monitoring, and human feedback signals in a single dashboard with sparkline trends.

Monitoring

Deep-dive production observability — time-series score trends, metric breakdowns, latency distributions, and anomaly alerts.
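
As a rough illustration of how an anomaly alert over a score trend can work, the sketch below flags points that deviate strongly from a trailing window. This is an assumption for illustration only: the function name `flag_anomalies`, the window size, and the z-score threshold are hypothetical, and the source does not say which detection method AXIS uses.

```python
from statistics import mean, stdev


def flag_anomalies(scores, window=5, threshold=3.0):
    """Return indices of scores that sit more than `threshold` standard
    deviations away from the mean of the preceding `window` scores.

    Hypothetical helper: not part of the AXIS API.
    """
    anomalies = []
    for i in range(window, len(scores)):
        history = scores[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat window gives no scale to compare against; skip it.
            continue
        if abs(scores[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Example: a quality score that is stable and then drops sharply once.
daily_scores = [0.90, 0.91, 0.89, 0.90, 0.92, 0.91, 0.55, 0.90]
print(flag_anomalies(daily_scores))  # → [6]
```

A trailing window keeps the baseline local, so a slow drift in scores does not trigger alerts while a sudden drop does. The trade-off is that an outlier inflates the variance of the windows that follow it.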

Annotation

Human-in-the-loop quality assessment with 3 annotation formats, tag-based critiques, and CSV export.

CaliberHQ

LLM judge calibration in a 3-step workflow: human annotation, EvidencePipeline-powered pattern discovery with learning insights, and alignment validation using Cohen's kappa and confusion matrices.
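
To make the alignment-validation step concrete, here is a minimal sketch of Cohen's kappa and a confusion matrix computed over paired human and LLM-judge labels. The function names and the pass/fail labels are assumptions chosen for illustration, not the AXIS implementation. It uses only the standard library; scikit-learn, listed in the tech stack, provides equivalent `cohen_kappa_score` and `confusion_matrix` functions.

```python
from collections import Counter


def confusion_matrix(human, judge, labels):
    """Rows are human labels, columns are judge labels, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for h, j in zip(human, judge):
        matrix[index[h]][index[j]] += 1
    return matrix


def cohen_kappa(human, judge):
    """Agreement between two raters, corrected for chance agreement."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(human)
    # Observed agreement: fraction of items where both raters match.
    observed = sum(h == j for h, j in zip(human, judge)) / n
    # Expected agreement if the raters labeled independently,
    # each keeping their own label frequencies.
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    expected = sum(human_counts[k] * judge_counts[k] for k in human_counts) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; treated here as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations: human verdicts vs. LLM-judge verdicts.
human = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass"]
judge = ["pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass"]

print(confusion_matrix(human, judge, ["pass", "fail"]))  # → [[4, 1], [1, 2]]
print(round(cohen_kappa(human, judge), 3))               # → 0.467
```

Raw agreement here is 75%, but kappa is noticeably lower because two raters who both favor "pass" would agree often by chance alone. That gap is why kappa is a more conservative signal for judge calibration than plain accuracy.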

Simulation

Synthetic persona-based agent testing with configurable personas, knowledge base upload, and conversation replay.

Memory

Decision memory dashboard with rule extraction, hard stops, batch analysis, and knowledge graph visualization.

Human Signals

Data-driven human-in-the-loop (HITL) dashboard with dynamic KPI strips, signal trend charts, classification distributions, and case-level drill-down.


Tech Stack

Layer     Technologies
Frontend  Next.js 14 (App Router), TypeScript, Tailwind CSS
State     Zustand, TanStack React Query
Charts    Plotly.js, D3.js
Backend   FastAPI, Python 3.12
Data      Pandas, NumPy, scikit-learn
Config    Pydantic Settings, python-dotenv

Quick Start

# Clone and install
git clone https://github.com/ax-foundry/axis.git
cd axis
make install

# Start development servers
make dev

Full installation guide