Open source, MIT licensed

Track what your
AI agents spend.

Set a dollar limit per key, session, or user. Requests that would exceed it are rejected before they reach the provider.

by @smigolsmigol

# wrap any command

$ npx @f3d1/llmkit-cli -- python agent.py

... agent runs normally ...

claude-sonnet-4 $0.0847 1,204 in / 380 out cache saved $0.31

gpt-4.1-mini $0.0182 890 in / 241 out

Session total: $4.12 / $50.00 budget 14 reqs in 38s

11 providers. 730+ models priced. Cache-aware pricing that tracks read and write tokens separately.

Anthropic: 29 models
OpenAI: 145 models
Google Gemini: 50 models
xAI Grok: 39 models
DeepSeek: 6 models
Groq: 37 models
Mistral: 63 models
Together: 105 models
Fireworks: 257 models
Ollama: local
OpenRouter: meta-gateway

Budget enforcement

Cost is reserved before the request. Exceeded means rejected, not logged after the fact.
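A minimal sketch of the reserve-before-request pattern described above. The names (`Budget`, `reserve`, `settle`) are illustrative, not the tool's actual API: a worst-case cost is reserved up front, the request is rejected if that reservation would break the limit, and the reservation is settled against actual cost afterward.

```python
class BudgetExceeded(Exception):
    pass

class Budget:
    """Illustrative session budget; not llmkit's real implementation."""

    def __init__(self, limit_usd: float):
        self.limit = limit_usd
        self.spent = 0.0      # settled cost of completed requests
        self.reserved = 0.0   # worst-case cost of in-flight requests

    def reserve(self, estimated_usd: float) -> None:
        # Reject before the provider is ever called if the worst case
        # would push the session over its limit.
        if self.spent + self.reserved + estimated_usd > self.limit:
            raise BudgetExceeded(f"would exceed ${self.limit:.2f} budget")
        self.reserved += estimated_usd

    def settle(self, estimated_usd: float, actual_usd: float) -> None:
        # Release the reservation and record what was actually spent.
        self.reserved -= estimated_usd
        self.spent += actual_usd

budget = Budget(limit_usd=50.00)
budget.reserve(1.00)          # reserved before the request goes out
budget.settle(1.00, 0.0847)   # actual cost came in far below the estimate
```

Reserving the worst case rather than the expected cost is what makes "exceeded means rejected" possible: a request can never land the session over budget after the fact.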

Accurate costs

Prompt caching makes tokens up to 90% cheaper. We track cached and uncached separately.
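To see why tracking cached and uncached tokens separately matters, here is a sketch of cache-aware pricing arithmetic. The rates are example numbers in the shape of typical provider pricing, not the tool's actual pricing table.

```python
# Hypothetical per-million-token rates in USD; cache reads are priced
# 90% below the plain input rate, cache writes slightly above it.
RATES_PER_MTOK = {
    "input": 3.00,
    "cache_write": 3.75,
    "cache_read": 0.30,
    "output": 15.00,
}

def request_cost(tokens: dict) -> float:
    """Price each token class at its own rate instead of lumping
    everything into 'input'."""
    return sum(
        tokens.get(kind, 0) / 1_000_000 * rate
        for kind, rate in RATES_PER_MTOK.items()
    )

# 10,000 cache-read tokens cost $0.003 here; billing them at the flat
# input rate would report $0.03, a 10x overstatement.
cost = request_cost({"cache_read": 10_000, "input": 1_204, "output": 380})
```

A tracker that ignores the cached/uncached split will systematically overstate the cost of any agent that reuses long prompt prefixes.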

Open source

MIT licensed. Self-host on Cloudflare Workers free tier. Your keys stay in your infra.

Free while in beta. No credit card.

MIT licensed. Built with Claude Code. Source on GitHub.