This comparison covers AI API pricing for 15 models as of May 2026. All prices were verified against official provider documentation and independent testing. Prices are in USD per million tokens.
Complete Pricing Table
| Model | Input ($/M tokens) | Output ($/M tokens) | Max Context (tokens) | Speed (tok/s) | Efficiency (tok/s ÷ output $/M) |
|---|---|---|---|---|---|
| Qwen3-8B | $0.01 | $0.01 | 32K | 156 | 15,600 |
| GLM-4-9B | $0.01 | $0.01 | 128K | 110 | 11,000 |
| Step-3.5-Flash | $0.08 | $0.15 | 32K | 160 | 1,066 |
| Qwen3.5-27B | $0.12 | $0.19 | 128K | 95 | 500 |
| DeepSeek V4 Flash | $0.18 | $0.25 | 128K | 142 | 568 |
| Qwen3-32B | $0.18 | $0.28 | 128K | 128 | 457 |
| Qwen-MT-Turbo | $0.18 | $0.30 | 32K | 90 | 300 |
| Qwen3-Coder-30B | $0.20 | $0.35 | 128K | 105 | 300 |
| DeepSeek V3.2 | $0.32 | $0.38 | 128K | 78 | 205 |
| Hunyuan-Turbo | $0.35 | $0.57 | 32K | 118 | 207 |
| GLM-4-32B | $0.49 | $0.56 | 128K | 72 | 128 |
| DeepSeek V4 Pro | $0.52 | $0.75 | 128K | 55 | 73 |
| GLM-5 | $0.73 | $1.92 | 128K | 48 | 25 |
| Kimi K2.5 | $0.59 | $3.00 | 128K | 52 | 17 |
| DeepSeek-R1 | $1.20 | $2.50 | 64K | 35 | 14 |
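
Because prices are quoted per million tokens, the cost of a single request is the token count times the listed price, divided by one million. The sketch below shows that arithmetic for three rows of the table. The function name `request_cost` and the workload of 2,000 input tokens and 500 output tokens are illustrative assumptions, not figures from any provider.

```python
# Prices in USD per million tokens as (input, output), copied from the table.
PRICES = {
    "Qwen3-8B": (0.01, 0.01),
    "DeepSeek V4 Flash": (0.18, 0.25),
    "Kimi K2.5": (0.59, 3.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload (an assumption for illustration):
# 2,000 input tokens and 500 output tokens per request.
for name in PRICES:
    cost = request_cost(name, 2_000, 500)
    print(f"{name}: ${cost:.6f} per request")
```

Multiplying the per-request figure by an expected request volume gives a rough monthly estimate. Output-heavy workloads widen the gap between models whose output price is far above their input price, such as Kimi K2.5.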
Cost-Efficiency Ranking
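The efficiency score is tokens per second divided by the output price in USD per million tokens, rounded down to a whole number. Higher means more speed per dollar. Qwen3-8B ranks first (15,600) because it combines the lowest output price ($0.01, tied with GLM-4-9B) with the second-highest throughput (156 tok/s, just behind Step-3.5-Flash at 160). Among the models priced at $0.19 per million output tokens or more, DeepSeek V4 Flash has the highest score (568).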
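
The score can be reproduced from the table with a few lines of Python. This is a minimal sketch covering five rows. Rounding down is an assumption inferred from the listed values (160 / 0.15 = 1066.67 appears as 1,066), and prices are stored as integer cents so the division stays exact. The names `MODELS` and `efficiency` are illustrative.

```python
# (tokens per second, output price in cents per million tokens), from the table.
# Integer cents avoid floating-point error when rounding down.
MODELS = {
    "Qwen3-8B": (156, 1),
    "Step-3.5-Flash": (160, 15),
    "DeepSeek V4 Flash": (142, 25),
    "GLM-4-32B": (72, 56),
    "DeepSeek-R1": (35, 250),
}


def efficiency(tok_per_s: int, output_cents: int) -> int:
    """Tokens per second divided by output price in USD, rounded down."""
    # tok_per_s / (output_cents / 100) == tok_per_s * 100 / output_cents
    return tok_per_s * 100 // output_cents


# Sort from most to least efficient.
ranking = sorted(
    ((name, efficiency(speed, cents)) for name, (speed, cents) in MODELS.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranking:
    print(f"{name}: {score:,}")
```

Because the score divides by output price alone, it favors models with very cheap output tokens. A workload dominated by input tokens may rank the same models differently.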
All models are accessible via Global API, which provides a unified endpoint and PayPal billing.