Model reference · Synced 2025-04-29
Llama 4 Scout 17B 16E Instruct
Llama 4 Scout 17B 16E Instruct is an open-weights AI model from Meta with a 128K-token context window. Capabilities: reasoning, tool calling, multimodal vision. Available on 8 providers. Cheapest listing: $0 input / $0 output per 1M tokens.
Quick facts
- Cheapest input: $0 per 1M tokens (Nvidia)
- Cheapest output: $0 per 1M tokens
- Context window: 128K tokens
- Max output: 4K tokens
- Release date: 2025-04-02
- Knowledge cutoff: 2024-02
- Capabilities: reasoning, tool calling, multimodal vision, open weights
- Provider count: 8
Provider pricing
Same model, different providers, different prices. Cheapest first.
| Provider | Input / 1M | Output / 1M | Context | Listed |
|---|---|---|---|---|
| Nvidia | $0 | $0 | 128K | 2025-04-02 |
| GitHub Models | $0 | $0 | 128K | 2025-01-31 |
| Synthetic | $0.15 | $0.60 | 328K | 2025-04-05 |
| Weights & Biases | $0.17 | $0.66 | 64K | 2025-01-31 |
| Azure Cognitive Services | $0.20 | $0.78 | 128K | 2025-04-05 |
| Azure | $0.20 | $0.78 | 128K | 2025-04-05 |
| Cloudflare AI Gateway | $0.27 | $0.85 | 128K | 2025-04-16 |
| Cloudflare Workers AI | $0.27 | $0.85 | 128K | 2025-04-16 |
Prices synced daily from models.dev + provider docs.
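Because every price is quoted per 1M tokens, the cost of a workload is linear in its input and output token counts. A minimal sketch of that arithmetic, using the Synthetic row from the table; the function name `workload_cost` and the token volumes are illustrative assumptions, not part of the source data:

```python
def workload_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of a workload given per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload: 50M input tokens and 10M output tokens,
# priced at the Synthetic listing ($0.15 input / $0.60 output per 1M).
cost = workload_cost(50_000_000, 10_000_000, 0.15, 0.60)
print(f"${cost:.2f}")  # prints $13.50
```

Output tokens cost about four times as much as input tokens on the paid listings, so the input-to-output ratio of a workload can change the total noticeably.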
How to use this model
If you're picking Llama 4 Scout 17B 16E Instruct for a project, the three things that matter most:
- Compare it side-by-side with one or two alternatives in the live comparison tool. Pricing differences matter more than benchmarks at scale.
- Pick the cheapest provider that meets your latency / SLA need. The spread is wide for the same weights: paid listings in the table above run from $0.15 to $0.27 per 1M input tokens, and two listings are $0.
- Re-evaluate every 3 months. Frontier prices drop fast; the cheapest model today may not be the cheapest a quarter from now.
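The "cheapest provider that meets your need" step can be sketched as a filter followed by a sort over the provider table. This is a minimal illustration that hard-codes the rows listed above. The `Listing` type, the `cheapest_listing` helper, the `exclude_free` switch, and the 75% input share are all assumptions made for the example. The table carries no latency or SLA data, so a minimum context window stands in as the constraint:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Listing:
    provider: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context_k: int       # context window, in thousands of tokens


# Rows copied from the provider pricing table.
LISTINGS = [
    Listing("Nvidia", 0.0, 0.0, 128),
    Listing("GitHub Models", 0.0, 0.0, 128),
    Listing("Synthetic", 0.15, 0.60, 328),
    Listing("Weights & Biases", 0.17, 0.66, 64),
    Listing("Azure Cognitive Services", 0.20, 0.78, 128),
    Listing("Azure", 0.20, 0.78, 128),
    Listing("Cloudflare AI Gateway", 0.27, 0.85, 128),
    Listing("Cloudflare Workers AI", 0.27, 0.85, 128),
]


def blended_price(listing: Listing, input_share: float = 0.75) -> float:
    """Price per 1M tokens for an assumed input/output mix (hypothetical 75/25)."""
    return (input_share * listing.input_per_m
            + (1 - input_share) * listing.output_per_m)


def cheapest_listing(listings, min_context_k: int,
                     exclude_free: bool = False) -> Optional[Listing]:
    """Return the lowest blended-price listing that satisfies the constraints."""
    candidates = [
        l for l in listings
        if l.context_k >= min_context_k
        and not (exclude_free and l.input_per_m == 0 and l.output_per_m == 0)
    ]
    # min() keeps the first of any tied listings, so table order breaks ties.
    return min(candidates, key=blended_price) if candidates else None


print(cheapest_listing(LISTINGS, min_context_k=128).provider)
print(cheapest_listing(LISTINGS, min_context_k=128, exclude_free=True).provider)
```

With these rows, the first call selects Nvidia and the second selects Synthetic. Swapping in measured latency or an SLA flag as an extra field on `Listing` would follow the same filter-then-sort shape.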
FAQ
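How much does Llama 4 Scout 17B 16E Instruct cost? $0 input / $0 output per 1M tokens at the cheapest listings (Nvidia and GitHub Models). Paid listings range from $0.15 to $0.27 input and $0.60 to $0.85 output per 1M tokens. See the table above for each provider.
What is the context window? 128K tokens at most providers. The table above lists 64K at Weights & Biases and 328K at Synthetic.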
Which providers offer it? Weights & Biases, Cloudflare AI Gateway, Azure Cognitive Services, Synthetic, Nvidia, Cloudflare Workers AI, Azure, GitHub Models.
Where do these numbers come from? models.dev + provider documentation, synced daily.