Everything you need to integrate Tally into your application.
The Tally SDK is available as an npm package. It works in Node.js 18+ and any modern bundler (Vite, esbuild, Rollup).
```shell
npm install @tally/sdk
```
The SDK has zero runtime dependencies. The only requirement is an API key, which you get automatically when you create your account. The source is available on GitHub.
Add two calls to your existing LLM integration: `route()` before the LLM call, and `telemetry()` after.
```typescript
import Anthropic from '@anthropic-ai/sdk'
import { TallyClient, buildEnvelope } from '@tally/sdk'

const anthropic = new Anthropic()
const tally = new TallyClient({ apiKey: process.env.TALLY_API_KEY! })

async function askAI(userMessage: string): Promise<string> {
  // 1. Describe this task to Tally
  const envelope = buildEnvelope({
    taskType: 'qa-simple',
    contextLength: 'short',
  })

  // 2. Ask which model to use
  const availableModels = [
    'claude-haiku-3-5-20251001',
    'claude-sonnet-4-5-20251001',
  ]
  const { recommended_model, exploration_flag } =
    await tally.route(envelope, availableModels)

  // 3. Call the recommended model
  const response = await anthropic.messages.create({
    model: recommended_model,
    max_tokens: 1024,
    messages: [{ role: 'user', content: userMessage }],
  })
  const content = response.content[0].type === 'text'
    ? response.content[0].text
    : ''
  const ntokInput = response.usage.input_tokens
  const ntokOutput = response.usage.output_tokens

  // 4. Report outcome back to Tally (fire and forget)
  tally.telemetry({
    semantic_envelope: envelope,
    model_used: recommended_model,
    recommended_model,
    outcome: 'success',
    ntok_input: ntokInput,
    ntok_output: ntokOutput,
  })

  return content
}
```
After signing up, your API key is available in the Portal under your account settings. It looks like:
```
tly_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
Set it as an environment variable. Never commit it to source control.
```shell
export TALLY_API_KEY=tly_your_key_here
```
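A fail-fast check at startup avoids surfacing a confusing auth error deep inside the first `route()` call. A minimal sketch — the `requireEnv` helper below is hypothetical, not part of the SDK:

```typescript
// Hypothetical startup guard: throw immediately if a required
// environment variable is missing, rather than failing later.
function requireEnv(name: string): string {
  const value = process.env[name]
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Usage at startup:
// const tally = new TallyClient({ apiKey: requireEnv('TALLY_API_KEY') })
```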
The main entry point. Instantiate once per application.
```typescript
import { TallyClient } from '@tally/sdk'

const tally = new TallyClient({
  apiKey: process.env.TALLY_API_KEY!,    // required
  endpoint: 'https://api.tallyy.org',    // optional override
  calibrationMode: 'off',                // 'off' | 'standard' | 'aggressive'
  sdkVersion: '1.0.0',                   // optional — enables 'certified' trust level
  onError: (err) => console.error(err),  // optional error handler
})
```
| Option | Type | Default | Description |
|---|---|---|---|
| apiKey | string | — | Required. Your Tally API key. |
| endpoint | string | https://api.tallyy.org | Override the Tally API endpoint. |
| calibrationMode | CalibrationMode | 'off' | Warm-up mode for new installations. Use 'standard' for the first 1–2 weeks. |
| sdkVersion | string | undefined | Set to enable 'certified' trust level for telemetry. |
Constructs a Semantic Envelope describing the shape of the task. This is what Tally reasons over — not the content of your prompt.
```typescript
import { buildEnvelope } from '@tally/sdk'

const envelope = buildEnvelope({
  taskType: 'code-debug',                 // see Task Types below
  structureType: 'code',                  // 'prose' | 'code' | 'json' | 'list' | 'mixed'
  contextLength: 'long',                  // 'short' | 'medium' | 'long' | 'very-long'
  estimatedTokens: '2k-8k',               // optional token bucket hint
  toolsDescriptors: [{ type: 'search' }], // optional tool list
  timeSensitive: false,                   // optional urgency hint
})
```
Only `taskType` is required. All other fields are optional hints that improve routing accuracy.
Asks Tally which model to use for this envelope. Returns the recommended model and an exploration flag.
```typescript
const result = await tally.route(envelope, availableModels, {
  // Optional hints
  intentHint: 'The user is debugging a React hook',
  userSkillBand: 'expert',
  explicitTimeSensitivity: 'low',
})

console.log(result.recommended_model) // 'claude-haiku-3-5-20251001'
console.log(result.exploration_flag)  // false — exploiting known best
```
The `availableModels` array is the set of models your account has access to. Tally only recommends from this list.
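Because routing is advisory, a common pattern is falling back to a default model if the routing call itself fails. A sketch — the `routeWithFallback` wrapper is hypothetical, not part of the SDK:

```typescript
// Hypothetical wrapper: fall back to a default model when routing fails,
// so an unreachable Tally API never blocks the user's request.
async function routeWithFallback(
  routeFn: () => Promise<{ recommended_model: string }>,
  fallbackModel: string,
): Promise<string> {
  try {
    const { recommended_model } = await routeFn()
    return recommended_model
  } catch {
    // Degrade gracefully to a model you trust.
    return fallbackModel
  }
}
```

Bind `routeFn` to your client at the call site, e.g. `() => tally.route(envelope, availableModels)`.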
Reports the outcome of an LLM call. The call is fire-and-forget: it does not block your request path, and the SDK retries on failure.
```typescript
tally.telemetry({
  // Envelope — same one you passed to route()
  semantic_envelope: envelope,

  // What model was used
  model_used: 'claude-haiku-3-5-20251001',

  // What Tally recommended (for adoption tracking)
  recommended_model: 'claude-haiku-3-5-20251001',

  // Outcome
  outcome: 'success', // 'success' | 'fail'

  // Token counts (for cost estimation)
  ntok_input: 4200,
  ntok_output: 380,

  // Optional quality signal [0, 1]
  quality_score: 0.92,

  // Optional session context (enables re-ask tracking)
  session_id: 'sess_abc123',
  conversation_turn_index: 3,
})
```
The `taskType` field in your envelope tells Tally what kind of work is happening. Choose the closest match.
- `code-debug`
- `code-review`
- `code-generation`
- `architecture-design`
- `data-analysis`
- `summarisation`
- `qa-simple`
- `creative-writing`
- `task-planning`
- `research-synthesis`
- `document-generation`
- `classification`
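In practice it helps to centralise this choice per application feature rather than scatter string literals across call sites. A sketch — the feature names and the mapping itself are made up for illustration:

```typescript
// Hypothetical mapping from application features to Tally task types.
const taskTypeByFeature: Record<string, string> = {
  'inline-help': 'qa-simple',
  'pr-review': 'code-review',
  'weekly-digest': 'summarisation',
  'ticket-router': 'classification',
}

function taskTypeFor(feature: string): string {
  // Fall back to the simple Q&A bucket when no closer match exists.
  return taskTypeByFeature[feature] ?? 'qa-simple'
}
```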
When you first deploy Tally, it has no data about your workload. Calibration mode forces a higher exploration rate so the bandit can build an initial model quickly.
- `off`: Normal operation. Use for established workloads where the bandit has enough data.
- `standard`: Recommended for the first 1–2 weeks. Balanced exploration to build the model faster.
- `aggressive`: High exploration rate. Use when launching with a completely new model pool.
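The guidance above can be encoded as a small policy when configuring the client. A sketch — the function and the 14-day threshold are assumptions mirroring the "first 1–2 weeks" advice, not SDK behaviour:

```typescript
type CalibrationMode = 'off' | 'standard' | 'aggressive'

// Hypothetical policy: aggressive for a brand-new model pool,
// standard for roughly the first two weeks, then off.
function calibrationModeFor(daysSinceInstall: number, newModelPool: boolean): CalibrationMode {
  if (newModelPool) return 'aggressive'
  if (daysSinceInstall <= 14) return 'standard'
  return 'off'
}
```

Pass the result as `calibrationMode` when constructing `TallyClient`.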
Every Tally account comes with a personal org. You can create additional team orgs and invite collaborators. API keys are scoped to orgs, so you can track costs per team or product.
Manage orgs and invite team members from the Portal under Account → Organisations.
The harness is a CLI tool that generates realistic synthetic workloads and drives them through Tally's routing engine, so you can exercise routing end-to-end before sending production traffic.
```shell
# Install the harness globally
npm install -g @tally/harness

# Run 100 events, random scenario mix, 5 events/sec
TALLY_API_KEY=tly_xxx tally-harness

# Run 500 events, code-debug only, faster rate
TALLY_API_KEY=tly_xxx tally-harness --count 500 --rate 20 --scenario code-debug

# See available scenarios
tally-harness --list

# Calibration warm-up (standard mode, burst traffic)
TALLY_API_KEY=tly_xxx tally-harness --count 2000 --calibration standard --burst
```
Or try the browser-based simulation — no API key required.
The Inspector is Tally's admin observability dashboard.
Inspector access requires admin credentials. Contact team@tallyy.org for access.
Sign in with Google. Your key is ready the moment your account is created.