The Harness

Tally Routing, Live

A synthetic workload generator that drives realistic AI tasks through Tally's routing engine. Watch the bandit learn in real time.

The live demo runs real routing logic.

No account needed. The demo runs entirely in your browser.

Simulation Controls
This demo runs a local simulation; no real API keys are required.
Click ▶ Run to start the harness

Total Events

0
routing decisions made

Cost Saved

$0.00
vs always-sonnet baseline

Exploration Rate

of calls are still learning

Model Distribution

No data yet
🎲

Exploration events

When the bandit's confidence for a task shape is below the threshold, it picks a non-greedy model to gather more signal. These events are marked explore.

Exploitation events

When confidence is high, Tally exploits what it knows by routing to the cheapest model that meets the quality bar. These events are marked exploit.
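
The explore/exploit rule described above can be sketched in a few lines. This is a minimal illustration, not Tally's actual implementation: the Candidate type, the route function, the default quality_bar and threshold values, and the fallback used when no model meets the quality bar are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str             # hypothetical model name
    cost_per_call: float  # hypothetical cost in USD
    est_quality: float    # router's current quality estimate, 0..1


def route(candidates, confidence, quality_bar=0.8, threshold=0.7, rng=random):
    """Return (model_name, label), where label is 'explore' or 'exploit'."""
    # Greedy choice: the cheapest model whose estimated quality meets the bar.
    qualified = [c for c in candidates if c.est_quality >= quality_bar]
    if qualified:
        greedy = min(qualified, key=lambda c: c.cost_per_call)
    else:
        # Assumed fallback: no model meets the bar, so take the best estimate.
        greedy = max(candidates, key=lambda c: c.est_quality)

    # Low confidence for this task shape: try a non-greedy model for signal.
    others = [c for c in candidates if c is not greedy]
    if confidence < threshold and others:
        return rng.choice(others).name, "explore"

    # High confidence: exploit the current best-known choice.
    return greedy.name, "exploit"


models = [
    Candidate("small", 0.001, 0.60),
    Candidate("medium", 0.004, 0.85),
    Candidate("large", 0.010, 0.95),
]
print(route(models, confidence=0.9))  # ('medium', 'exploit')
print(route(models, confidence=0.2))  # an 'explore' pick of 'small' or 'large'
```

In this sketch, exploration picks uniformly at random among the non-greedy models. A real bandit would more likely weight that choice by uncertainty, which the page does not describe.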

📈

Savings accumulate

Every exploitation event that routes to a model cheaper than the baseline adds to the savings total. The savings accumulate with each call, so over millions of calls the total becomes significant.
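
As a rough sketch of how such a counter could be computed, the function below sums the difference between a fixed baseline cost and the cost of the chosen model for each exploit event that picked a cheaper model. The cost_saved name, the event tuples, and the per-call prices are hypothetical, and how the demo accounts for exploration events is not stated, so they are simply ignored here.

```python
def cost_saved(events, baseline_cost):
    """Sum savings over exploit events that chose a model cheaper than baseline.

    events: iterable of (label, chosen_cost) pairs, with label 'explore' or
    'exploit' and chosen_cost in USD (hypothetical values).
    """
    total = 0.0
    for label, chosen_cost in events:
        if label == "exploit" and chosen_cost < baseline_cost:
            total += baseline_cost - chosen_cost
    return total


# Hypothetical prices: the baseline model costs $0.010 per call.
events = [("exploit", 0.002), ("explore", 0.010), ("exploit", 0.002)]
print(f"${cost_saved(events, baseline_cost=0.010):.3f}")
```

Under these made-up prices each cheaper exploit call saves $0.008, so a million such calls would add up to about $8,000.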

Connect your real workload

The harness is a simulation. Your real savings come from running Tally on your actual AI calls.