Intelligence that grows
with every contribution.

Tally is a crowd-sourced platform. Every consumer and provider who connects makes the routing smarter — not just for themselves, but for everyone.

Join our Discord
Crowd-sourced intelligence

Tally gets smarter
every time someone uses it.

Most software ships with fixed logic and gets updated on a vendor's schedule. Tally works differently. The routing intelligence is crowd-sourced — built from the telemetry of every consumer who reports a call outcome and every provider who publishes capability signals.

When you report that a task of a certain shape performed well on a lightweight model, that signal propagates. The next person routing a similar shape benefits from it — even if they have never spoken to you and never will. This is how infrastructure behaves. Roads get better because millions of people drive on them. Tally gets better because millions of calls flow through it.

The more the community contributes, the smarter the platform becomes for everyone. Early adopters aren't just users — they are co-authors of the routing intelligence. Their production data shapes the decisions that help the next wave of teams get their routing right from day one.

📡

Consumers

Every call you route through Tally and every outcome you report feeds the shared bandit. Your task shapes train the model for everyone working in similar domains.

🔧

Providers

Every MCP server connected to Tally publishes capability signals and usage patterns. This data helps the platform understand tool costs, latencies, and success rates — and route accordingly.

📊

Shape data

Task shapes — clusters of similar call patterns — get richer with every integration. A shape that has seen ten thousand calls is routed far more reliably than one that has seen only ten.

🌱

Compounding value

The intelligence compounds. A platform with a thousand integrations is not ten times better than one with a hundred; it is far better than that, because each new integration adds task shapes, providers, and outcome combinations the network has never seen before, so signal diversity grows much faster than the integration count.
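The "shared bandit" the cards above refer to can be pictured as a per-shape success tracker. The sketch below is purely illustrative: it uses a simple epsilon-greedy strategy, and the class and field names are invented for this example, not Tally's actual algorithm or API.

```python
import random
from collections import defaultdict

class ShapeBandit:
    """Illustrative per-shape epsilon-greedy bandit: each task shape keeps
    its own running success rate for every model that has handled it."""

    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        # stats[shape][model] = [successes, trials]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def recommend(self, shape, models):
        """Mostly exploit the best-known model for this shape; occasionally
        explore so newer models can accumulate signal too."""
        if random.random() < self.epsilon:
            return random.choice(models)
        return max(models, key=lambda m: self._rate(shape, m))

    def report(self, shape, model, success):
        """One consumer's reported outcome updates the shared statistics,
        so the next caller with a similar shape benefits from it."""
        s = self.stats[shape][model]
        s[0] += int(success)
        s[1] += 1

    def _rate(self, shape, model):
        successes, trials = self.stats[shape][model]
        # Optimistic prior for unseen pairs, so they still get tried.
        return successes / trials if trials else 0.5
```

This is also why a shape with ten thousand observations routes more reliably than one with ten: the estimated success rates have converged, so exploitation is rarely wrong.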

Open SDK

Transparent at
every step.

We ask you to trust Tally with the telemetry of your AI workloads. That is not a small ask. So we have made it possible to verify exactly what we do before you commit a single call.

The SDK is open source

Read every line before you run a single call.

The Tally SDK is fully open. No compiled blobs. No opaque network requests buried inside a library you import and trust blindly. Every function, every payload, every API call — readable, auditable, forkable.

  1. Before the call: you can read exactly what the SDK sends to Tally when requesting a routing recommendation — the task shape, the declared constraints, nothing else.
  2. During the call: your prompt and response go directly between your application and your LLM or MCP provider. Tally never sits in that path. Not a proxy. Not a middleman. We are a witness, not a relay.
  3. After the call: you can read exactly what the SDK reports back — the outcome shape, the model used, the latency. No content. No prompt text. No response data. Just the observable facts of the call.
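The three steps can be sketched as data. The field names and values below are illustrative assumptions, not the real SDK schema; they only mirror the guarantees stated above.

```python
# Hypothetical payload shapes -- field names are invented for illustration,
# not taken from the actual Tally SDK.

# 1. Before the call: the routing request carries only the task's shape
#    and declared constraints. No prompt text leaves your process.
routing_request = {
    "task_shape": "summarise-short-doc",
    "constraints": {"max_latency_ms": 2000, "max_cost_usd": 0.01},
}

# 2. During the call: your application talks to the provider directly.
#    Nothing in this step touches Tally -- a witness, not a relay.

# 3. After the call: the outcome report contains only observable facts.
outcome_report = {
    "task_shape": "summarise-short-doc",
    "model_used": "some-lightweight-model",
    "latency_ms": 840,
    "success": True,
    # Deliberately absent: prompt text, response content.
}
```

Because the SDK is open source, the claim that no other fields exist is checkable: you can grep the reporting code path and see every key it ever serialises.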

We believe that if you cannot see what a system is doing, you should not trust it with your workloads. The open SDK is not a feature — it is a commitment to earning your trust through transparency rather than asking for it on faith.

Your voice in Tally

Discord is how we build
this together.

A crowd-sourced platform only works if the crowd has a say in where it goes. We are not building Tally in private and releasing updates to passive users. We are building it with the people who depend on it — and Discord is where that conversation happens.

Shape the routing logic. When the bandit makes decisions you disagree with, tell us. When a task shape is being misclassified, surface it. When you have workload patterns the current model doesn't handle well, bring them to the community. The people who find the edges are the people who make the platform better at the edges.

Influence the roadmap. Consumers and providers have fundamentally different needs. MCP monetisation looks different from LLM cost optimisation. Real-time streaming workloads have different constraints from batch jobs. We need to hear from all of them — because the roadmap should reflect the actual distribution of how people use this, not our assumptions about it.

Build in public. Architecture decisions, API changes, new shape classifications, pricing experiments — we will surface them in Discord before they ship. The community gets early visibility and the opportunity to push back. That is how you give users a genuine voice rather than a feedback form nobody reads.

Join the community

Consumers, providers, and builders — all shaping the future of intelligent AI routing together. Discord is where it happens.

Join our Discord →

Be part of something smarter.

Every integration makes the platform better for everyone. Join the community and help shape what Tally becomes.

Next up

LLM Market

Live token prices across every major provider, updated every 6 hours.