Token Demand Index

The output side of the AI-compute economy — the companion to the Compute Tightness Index (which tracks the GPU input). The viral charts show the falling price of a token and conclude the AI trade is deflating. The number that actually settles it is demand — and demand is exploding: +93% since 2026-05-06, with open-source already carrying roughly half of all tracked inference tokens at a fraction of frontier price. A market getting bigger, not a margin getting crushed.

Demand index (100 = 2026-05-06): 192.8
Tokens / week: 45.3T
Demand, 30 days: ▲ +84%
Open-source share: 46.5%
Effective $/1M: $2.18
Implied spend/week*: $98.7M

As of 2026-06-25, tracked AI inference runs at 45.3T tokens/week, up 84% in 30 days and 93% since 2026-05-06. Open-source carries 46.5% of those tokens, holding the demand-weighted price near $2.18/1M, a fraction of frontier list prices even as frontier models themselves got pricier.

⤓ Download full daily series (CSV)

Token demand vs the effective price of inference

Series: Total tokens/day; Effective $/1M (right axis)

Who consumes the tokens

Series: Open-source, Other, Claude, OpenAI, Google, xAI

By provider group — latest snapshot

Group        Blended $/1M  Tokens/week  Share   30d demand
Open-source  $0.40         21.1T        46.5%   ▲ +125%
Other        $0.93         10.2T        22.5%   ▲ +266%
Claude       $10.00        6.6T         14.5%   ▲ +19%
OpenAI       $3.47         3.4T         7.4%    ▲ +22%
Google       $0.85         4.0T         8.8%    ▼ -3%
xAI          $1.56         124.1B       0.3%    ▲ +13%

Open-source = aggregated weights-available labs (DeepSeek, Qwen, Llama, Mistral, etc.). "Other" = unclassified OpenRouter providers. Share = % of tracked tokens. *Implied spend = list price × tokens, not realized revenue.
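The headline figures can be roughly re-derived from the provider table. The sketch below assumes that the effective $/1M is the token-weighted mean of each group's blended price and that implied spend is the sum of price × tokens per group; the page does not spell out the exact aggregation, so treat both as assumptions. The names `GROUPS` and `summarize` are illustrative only. Because the table values are rounded, the results land close to, but not exactly on, the published 45.3T, $2.18 and $98.7M.

```python
# Snapshot values copied from the provider table: (blended $/1M, tokens/week).
GROUPS = {
    "Open-source": (0.40, 21.1e12),
    "Other": (0.93, 10.2e12),
    "Claude": (10.00, 6.6e12),
    "OpenAI": (3.47, 3.4e12),
    "Google": (0.85, 4.0e12),
    "xAI": (1.56, 124.1e9),
}


def summarize(groups):
    """Return (total tokens, share per group, effective $/1M, implied spend in $).

    Assumption: effective price is the token-weighted mean of the
    group-level blended prices, which equals implied spend / total tokens.
    """
    total_tokens = sum(tokens for _, tokens in groups.values())
    shares = {name: tokens / total_tokens for name, (_, tokens) in groups.items()}
    # Prices are quoted per 1M tokens, so convert token counts to millions.
    implied_spend = sum(price * tokens / 1e6 for price, tokens in groups.values())
    effective_price = implied_spend / (total_tokens / 1e6)
    return total_tokens, shares, effective_price, implied_spend


if __name__ == "__main__":
    total, shares, effective, spend = summarize(GROUPS)
    print(f"tokens/week: {total / 1e12:.1f}T")
    print(f"open-source share: {shares['Open-source']:.1%}")
    print(f"effective $/1M: {effective:.2f}")
    print(f"implied spend/week: ${spend / 1e6:.1f}M")
```

The arithmetic shows why the weighted price sits near $2: Claude's 14.5% share at $10.00 contributes about two thirds of implied spend, while open-source's 46.5% share at $0.40 contributes under a tenth.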

Methodology

Built from OpenRouter's model rankings + pricing, grouped into Claude / OpenAI / Google / xAI / Open-source / Other. Token counts are OpenRouter's trailing-7-day figures (the rankings "week" view) — i.e. each point is tokens processed over the prior week, not a single day. Per group: blended median $/1M (75/25 input/output) and weekly total_tokens. Demand index = latest weekly total ÷ the first week's total × 100. Effective $/1M = demand-weighted blended price (as-of prices carried forward across snapshots where only one side updated). Series since 2026-05-06; it grows as new snapshots land. Full history as CSV. Coverage is OpenRouter traffic — a large but partial slice of the market.
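A minimal sketch of the three calculations described above, under stated assumptions: the 75/25 blend is applied per model and the group figure is the median of those blends (the page does not say whether the median is taken before or after blending), and "carried forward" is read as filling a missing price with the last known one. All function names and sample values are hypothetical, and the 23.5T base week is back-computed from the published index rather than taken from the data.

```python
from statistics import median


def blended_price(input_per_m, output_per_m, input_weight=0.75):
    """Blend input and output $/1M with a 75/25 input/output weighting."""
    return input_weight * input_per_m + (1.0 - input_weight) * output_per_m


def group_blended_median(models):
    """Median of per-model blended prices; models is a list of (input, output) $/1M."""
    return median(blended_price(i, o) for i, o in models)


def demand_index(weekly_totals):
    """Index each weekly total against the first week (first week = 100)."""
    base = weekly_totals[0]
    return [100.0 * total / base for total in weekly_totals]


def carry_forward(values):
    """Replace None with the most recent known value (assumed carry-forward rule)."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


if __name__ == "__main__":
    # Hypothetical model prices for one group: (input $/1M, output $/1M).
    sample_models = [(1.0, 5.0), (0.2, 0.6), (3.0, 15.0)]
    print(group_blended_median(sample_models))
    # Illustrative base week (23.5T) and the latest published total (45.3T).
    print(demand_index([23.5e12, 45.3e12]))
    # A price series with gaps where a snapshot did not update.
    print(carry_forward([None, 0.40, None, 0.42]))
```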

Pair it with the Compute Tightness Index (GPU input) for both sides of the AI-compute economy. Query the raw data from an agent via the SwiftAlerts MCP endpoint (get_inference_economics).