Tally is a witness to your AI activity, not a participant in it. Your prompts, your data, your conversations — they never pass through us. Here's exactly what that means in practice.
Last updated: February 2026 · Effective immediately
When you use Tally, your application talks directly to the LLMs and MCP servers you've configured. Tally does not sit in that data path. We do not proxy your requests, buffer your messages, or touch the content of any conversation.
What Tally receives is a structural signal — a semantic shape — computed by the Tally SDK inside your own application before anything leaves your stack. That shape describes the structure of a call: how long the context is, what kind of task it looks like, whether tools are involved. It contains none of your actual data.
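As a sketch of that local computation — every field name and heuristic below is illustrative, not the Tally SDK's actual schema — a shape might be derived like this:

```python
# Illustrative sketch only: field names and thresholds are hypothetical,
# not the real Tally SDK schema.

def compute_shape(messages, tools=None):
    """Derive structural features from a call context locally,
    without retaining any of the content itself."""
    context_chars = sum(len(m["content"]) for m in messages)
    return {
        "context_length": context_chars,   # how long the context is
        "message_count": len(messages),
        "uses_tools": bool(tools),         # whether tools are involved
        "task_kind": "long_form" if context_chars > 4000 else "short_form",
    }

shape = compute_shape(
    [{"role": "user", "content": "Summarise this contract for me..."}]
)
# Only structure survives: lengths, counts, flags -- none of the text.
```

The point of the sketch is the direction of data flow: the raw messages never leave the function, and the returned dictionary contains nothing that could be reversed into them.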
Your content never reaches our servers. This is not a policy choice we could reverse on a bad day — it is how the system is built. The SDK computes shapes locally. Shapes are structural fingerprints. Fingerprints cannot be reversed into content.
Tally's job is to observe patterns across many calls and offer routing advice. You decide whether to take that advice. We have no other role in your AI interactions.
Every call you route through Tally produces a semantic envelope — a small structured record of what kind of call it was, computed entirely from metadata your SDK already has access to.
Want to see what a shape looks like? Paste any API context into the Semantic Shape Demo on our live demo page. You'll see the exact structural fingerprint the SDK computes, and you'll notice it contains nothing you'd consider sensitive.
Tally observes patterns, builds confidence about which models perform best for which shapes, and offers a recommendation when you ask for one. You determine what to do with that recommendation.
Your application calls tally.route(shape), receives back a suggested model, and then makes its own LLM call. Tally never makes that call on your behalf. You are always the one initiating contact with the model provider.
If you disagree with a Tally recommendation — or want to ignore routing advice entirely — nothing breaks. The recommendation is advisory. The decision is yours.
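The advisory flow above can be sketched as follows. Only the `tally.route(shape)` call is taken from this page; the stand-in client, the model names, and the stubbed provider call are all hypothetical:

```python
# Sketch of the advisory routing flow. TallyClient is a stand-in for the
# real SDK object; model names and call_llm are placeholders.

class TallyClient:
    def route(self, shape):
        # The real service consults learned cluster models; this stub
        # only illustrates the contract: shape in, suggested model out.
        return "model-b" if shape.get("uses_tools") else "model-a"

def call_llm(model, messages):
    # Your application makes this call itself -- Tally never does.
    return f"response from {model}"

tally = TallyClient()
shape = {"context_length": 1200, "uses_tools": False}

suggestion = tally.route(shape)   # advisory only
chosen = suggestion or "model-a"  # overriding or ignoring it breaks nothing
reply = call_llm(chosen, [{"role": "user", "content": "..."}])
```

Note that the LLM request is initiated by your code in the final line; Tally's involvement ends once the suggestion is returned.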
To operate the service, we store the following:
We do not sell this data. We do not use it to train general-purpose AI models. Shape telemetry from your account is used solely to improve routing recommendations for your account and, in aggregate and anonymized form, to improve Tally's shared cluster models.
Tally includes an MCP Market — an aggregate view of which MCP servers are most used across all public accounts, and which shapes are most expensive when MCPs are involved. Participation is on by default, and you can opt out per server.
When you register a Consumer MCP in your account, you'll see a Private MCP toggle. Enabling it excludes that MCP from all aggregate market data. Your usage of that server, its name, endpoint, and any associated shapes will not appear in the MCP Market or any other cross-account view.
Private MCPs are fully functional — the flag affects only visibility in aggregate reporting, not routing behaviour.
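A minimal sketch of how the Private MCP flag behaves — the function names and registry structure here are invented for illustration, not the real API:

```python
# Hypothetical illustration of the Private MCP toggle. The registration
# helper and registry layout are invented; only the private/visible
# behaviour mirrors what this page describes.

def register_mcp(registry, name, endpoint, private=False):
    registry[name] = {"endpoint": endpoint, "private": private}
    return registry[name]

def market_view(registry):
    # Aggregate market reporting excludes private MCPs;
    # routing still uses every registered server.
    return [name for name, cfg in registry.items() if not cfg["private"]]

registry = {}
register_mcp(registry, "search-mcp", "https://example.com/mcp")
register_mcp(registry, "internal-mcp", "https://internal/mcp", private=True)
# market_view(registry) -> ["search-mcp"]
```

The flag gates visibility in one place only: the cross-account aggregate view. Everything else treats both servers identically.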
The Tally portal uses session cookies to maintain your login state. We do not use advertising trackers, third-party analytics cookies, or cross-site tracking of any kind.
The public tallyy.org website uses no tracking cookies. The Live Demo page uses sessionStorage to remember that you passed the verification gate for the current browser session — this never leaves your device.
You may request deletion of your account and all associated data at any time by emailing team@tallyy.org. We will confirm deletion within 7 business days.
All data in transit is encrypted via TLS 1.3. Data at rest is encrypted using AES-256. API keys are stored encrypted and are never returned in full after initial creation — only a masked prefix is shown in the portal.
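The masked-prefix display mentioned above can be sketched like this — the prefix length and mask character are assumptions for illustration, not the portal's actual format:

```python
def mask_key(api_key, visible=8):
    """Show only a short prefix of a stored API key, as the portal does
    after initial creation. The 8-character prefix is an assumption."""
    if len(api_key) <= visible:
        return api_key
    return api_key[:visible] + "…"

# mask_key("tally_sk_1234567890abcdef") -> "tally_sk…"
```

The full key is shown exactly once, at creation; afterwards only the masked form is ever rendered.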
We follow a principle of least privilege internally: no employee has routine access to customer account data. Access requires a logged, time-limited request and is audited.
If you discover a security vulnerability, please report it responsibly to team@tallyy.org. We take these reports seriously and will respond within 48 hours.
Tally uses a small set of infrastructure providers to operate the service:
We do not share shape telemetry, routing data, or account information with any third party for commercial purposes. We do not have data-sharing agreements with any LLM provider.
Remember: your LLM provider receives your actual call content directly from your application. Tally is not party to that transaction. Review your LLM provider's privacy policy for how they handle your prompts and completions.
If we make material changes to this policy — particularly anything that affects what data we collect or how we use it — we will email registered users at least 14 days before the change takes effect and post a notice in the portal.
Changes that narrow what we collect (i.e., we decide to collect less) take effect immediately without notice.
The core architectural commitment — that content never reaches our servers — is not a policy matter and is not subject to change by policy update. It is enforced by how the system is built.
Tally's recommendations are based on one thing: which model has performed best for calls that look like yours. Not which model paid us. Not which model has a partnership agreement with us. We will never accept money, equity, data arrangements, or any other consideration in exchange for routing preference.
We are completely model-agnostic. Our job is to learn which tool performs best for each task — and report that honestly, regardless of who made the tool. The moment we accept a payment to favour a model, Tally's recommendations become worthless. We know that. It will not happen.
If you build an LLM and want to help us understand how to better characterise the shapes your model handles well, we want to hear from you. We would welcome telemetry partnerships with any model provider who wants to help us make more accurate routing calls — on the condition that the data flows both ways and no routing preference is implied or expected.
Better shape definitions mean better calls for every user, regardless of which model wins. That's a collaboration worth having.
Today, Tally watches individual API calls — the shape of each request, the outcome, the cost. But the plan extends further.
We intend to watch Looms: extended sessions, conversation threads, patterns across many responses from the same model over time. A single answer is hard to judge. A hundred answers to similar questions from the same model tell a story. If a model is systematically steering answers to protect a commercial relationship — recommending certain vendors, downplaying certain risks, hedging in ways that benefit its maker — that pattern will show up in Loom-level telemetry before it shows up anywhere else.
Before AdWords ruins agents the way it ruined search, Tally will put up a fight.
Paid placement destroyed search's usefulness for a generation. The same pressure is already building on AI. Developers, enterprises, and end users deserve a layer that watches for it — one that can't be bought. That's what Tally is building. That's what our community is building with us.
This is not our mission today. Today the mission is saving you money and making smarter routing calls. But the infrastructure we're building — shape telemetry, Loom-level observation, cross-model benchmarking — is exactly the infrastructure needed to hold models accountable. We are building it with that future in mind.
Questions about this policy or how your data is handled?
We're a small team. You'll hear from a person, not a bot.