Pricing
Pay only for what you use. No subscriptions.
Anthropic, OpenAI, and Google models pass through at official prices with no markup. Chinese models carry an infrastructure premium of 0–80%, depending on the model group. Top-ups are free on every payment rail. Top up once, spend anywhere.
Top-up fee on every payment rail
Markup on Anthropic / OpenAI / Google
Free credit on signup
Minimum top-up
Per-model pricing
All prices in USD per 1M tokens unless otherwise noted. Updated weekly from upstream rate cards.
Chat models
| Model | Input | Output |
|---|---|---|
| DeepSeek V3.2 (🇨🇳 CN): GPT-4-class general model. The most cost-effective model in the world for code & chat. | $0.14/1M | $0.28/1M |
| Kimi K2.5 (🇨🇳 CN): Long-context tool-use champion. Strong agentic behaviour, free 128K context. | $0.60/1M | $2.50/1M |
| Qwen3 Max (🇨🇳 CN): 256K context. Strong on multilingual and bilingual reasoning. | $0.60/1M | $2.40/1M |
| MiniMax M2.5 (🇨🇳 CN): Fastest Chinese chat model, 130 tokens/sec on cold start. | $0.30/1M | $1.20/1M |
| Claude Sonnet 4.6 (Global): Workhorse Anthropic model. 5× cheaper than Opus, ~90% the quality. | $3.00/1M | $15.00/1M |
| GPT-5.5 (Global): Latest GPT generation. Pass-through pricing. | $5.00/1M | $20.00/1M |
| Gemini 3 Pro (Global): 2M token context. Pass-through pricing. | $1.25/1M | $5.00/1M |
Reasoning models
| Model | Input | Output |
|---|---|---|
| DeepSeek R1 (🇨🇳 CN): Open reasoning model rivalling o1 on math & logic, at roughly 1/27 the price. | $0.55/1M | $2.19/1M |
| Claude Opus 4.7 (Global): Frontier reasoning. Pass-through pricing, exactly the official rate. | $15.00/1M | $75.00/1M |
| o1 (Global): OpenAI flagship reasoning. Pass-through pricing. | $15.00/1M | $60.00/1M |
Code models
| Model | Input | Output |
|---|---|---|
| GLM-4.6 (🇨🇳 CN): Coder-tuned. Strong on Chinese function calling and tool routing. | $0.50/1M | $1.50/1M |
| Qwen3 Coder (🇨🇳 CN): Coder-specialised Qwen variant. SOTA on HumanEval-CN. | $0.30/1M | $1.50/1M |
Vision models
| Model | Input | Output |
|---|---|---|
| Doubao 1.6 Pro (🇨🇳 CN): Strong on image understanding & Chinese OCR. Mature production model. | $0.40/1M | $1.20/1M |
| GPT-4o (Global): Multimodal frontier. Pass-through pricing. | $2.50/1M | $10.00/1M |
Video models
| Model | Price |
|---|---|
| Kling 1.6 (🇨🇳 CN): Best Chinese text-to-video. 5s/10s clips, 720p/1080p output. | $0.50 per second of video |
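To make the per-1M-token rates concrete, here is a minimal Python sketch that estimates the cost of a single request. The two rates are copied from the tables above; the `RATES` dictionary and the `estimate_cost_usd` helper are illustrative names, not part of any official SDK.

```python
from decimal import Decimal

# Rates in USD per 1M tokens as (input, output), copied from the tables above.
RATES = {
    "DeepSeek V3.2": (Decimal("0.14"), Decimal("0.28")),
    "Claude Sonnet 4.6": (Decimal("3.00"), Decimal("15.00")),
}

PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request at the listed per-1M-token rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / PER_MILLION


if __name__ == "__main__":
    # Hypothetical request size: 20,000 input tokens and 2,000 output tokens.
    for model in RATES:
        print(model, estimate_cost_usd(model, 20_000, 2_000))
```

At these rates, a request with 20,000 input tokens and 2,000 output tokens comes to $0.00336 on DeepSeek V3.2 and $0.09 on Claude Sonnet 4.6.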
Payment rails — every one at zero fee.
Most aggregators charge 5–6% on top-ups. We don't. We earn our margin on the volume spread on Chinese models, not on your wallet.
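| Payment rail | Our top-up fee | Elsewhere | Note |
|---|---|---|---|
| USDT (TRC-20 / ERC-20) | 0% | OpenRouter: 5% | |
| Stripe (Card) | 0% | OpenRouter: 5.5% credit card | We absorb the 2.9% gateway fee |
| WeChat Pay / Alipay (China rails) | 0% | Not supported by OpenRouter | |
| Bank wire + VAT invoice (China B2B) | 0% | Enterprise only on most platforms | |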
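As a rough illustration of what a top-up fee costs, the Python sketch below computes the fee on a single top-up. It assumes the fee is a flat percentage of the top-up amount; how any particular platform actually applies its fee may differ, and `topup_fee_usd` is a hypothetical helper.

```python
from decimal import Decimal


def topup_fee_usd(amount_usd: Decimal, fee_percent: Decimal) -> Decimal:
    """Fee in USD, assuming a flat percentage of the top-up amount."""
    fee = amount_usd * fee_percent / Decimal(100)
    return fee.quantize(Decimal("0.01"))  # round to whole cents


if __name__ == "__main__":
    amount = Decimal("200")  # hypothetical top-up of $200
    for label, percent in [("0% fee", "0"), ("5% fee", "5"), ("5.5% fee", "5.5")]:
        print(label, topup_fee_usd(amount, Decimal(percent)))
```

Under that assumption, a $200 top-up costs $10.00 in fees at 5%, $11.00 at 5.5%, and nothing at 0%.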
China B2B: bank wire + auto-issued 6% VAT e-invoice
Invoice arrives within 7 business days. Save your company name, tax ID, and bank details once.
Pricing FAQ
Why is there a markup on Chinese models?
Upstream Chinese providers charge in CNY, demand prepaid corporate accounts, and rate-limit aggressively. We absorb that infra cost (Singapore + Shanghai egress, retry logic, multi-channel failover) and pass it on to you at one transparent rate. The markup ranges 0–80% by group; check each model in the tables above.
Why do you not mark up GPT / Claude / Gemini?
These vendors take direct USD payment and have global infra. We have nothing to add — we just route. The only reason to use Routify for them is to consolidate billing and keep one wallet across all models.
Do you keep my data?
No. By default, prompts and completions are not logged. Aggregate metadata (token counts, latency, model id) is retained for billing for 90 days. You can opt into per-request logging for debugging from the dashboard.
What about volume discounts?
Above $1k/month spend, you qualify for a VIP rate (5% discount across all models). Above $10k/month, contact us for enterprise terms with dedicated channels and SLA.
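The tiers can be sketched in Python as follows. The thresholds and the 5% figure come from the answer above. The function names are hypothetical, and the sketch assumes the VIP discount applies to the whole month's list-price spend, which the answer does not spell out. Enterprise terms are negotiated, so no rate is computed for that tier.

```python
from decimal import Decimal

VIP_THRESHOLD_USD = Decimal("1000")          # "$1k/month"
ENTERPRISE_THRESHOLD_USD = Decimal("10000")  # "$10k/month"
VIP_DISCOUNT = Decimal("0.05")               # 5% across all models


def pricing_tier(monthly_spend_usd: Decimal) -> str:
    """Classify a month's spend into the tiers described above."""
    if monthly_spend_usd > ENTERPRISE_THRESHOLD_USD:
        return "enterprise"
    if monthly_spend_usd > VIP_THRESHOLD_USD:
        return "vip"
    return "standard"


def apply_vip_discount(list_price_spend_usd: Decimal) -> Decimal:
    """Spend after the VIP discount (assumed to cover the whole month)."""
    return list_price_spend_usd * (Decimal(1) - VIP_DISCOUNT)


if __name__ == "__main__":
    spend = Decimal("2500")  # hypothetical monthly spend at list price
    print(pricing_tier(spend), apply_vip_discount(spend))
```

Under that assumption, $2,500 of list-price usage falls in the VIP tier and would be billed at $2,375.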
Can I bring my own API key (BYOK)?
Not yet. BYOK is planned for Q3 2026. You will be able to attach your own Anthropic / OpenAI / Google key and pay only the routing fee (0% on the first 1M requests/month).
How is billing computed?
Per-request, per-token. Each response carries `X-Routify-Cost-USD` and `X-Routify-Cost-CNY` headers so your code can audit costs immediately. The dashboard shows every request as a line item, exportable as CSV or JSON.
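A client-side audit of those headers might look like the Python sketch below. The header names are the ones given above. The sketch assumes each header holds a plain decimal string such as `0.00336`, which this page does not specify, and `read_cost` is an illustrative helper rather than part of an official client.

```python
from decimal import Decimal
from typing import Mapping, Optional, Tuple

COST_USD_HEADER = "X-Routify-Cost-USD"
COST_CNY_HEADER = "X-Routify-Cost-CNY"


def _get_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive header lookup that also works on a plain dict."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


def read_cost(headers: Mapping[str, str]) -> Tuple[Optional[Decimal], Optional[Decimal]]:
    """Return (usd, cny) costs from response headers; None where a header is absent.

    Assumes each header value is a plain decimal string.
    """
    def parse(name: str) -> Optional[Decimal]:
        raw = _get_header(headers, name)
        return Decimal(raw.strip()) if raw is not None else None

    return parse(COST_USD_HEADER), parse(COST_CNY_HEADER)


if __name__ == "__main__":
    # Hypothetical headers, standing in for a real HTTP response.
    sample = {"x-routify-cost-usd": "0.00336", "X-Routify-Cost-CNY": "0.0241"}
    total_usd = Decimal("0")
    usd, cny = read_cost(sample)
    if usd is not None:
        total_usd += usd  # running total to compare against the dashboard export
    print(usd, cny, total_usd)
```

Using `Decimal` rather than `float` keeps the running total exact, which matters when reconciling many small per-request amounts against a CSV or JSON export.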