LLM API Pricing 2026: 25 Providers Compared

LLM API Providers Software Pricing 2026

Compare pricing for 25 LLM API providers. Find the right one for your budget.

25 Products
$0–$270 Price Range /million tokens
$11 Median /million tokens
10 Free Tiers

Pricing across the 25 LLM API providers listed here ranges from $0 to $270 in 2026, quoted per million tokens for most providers. The median is around $11. Top picks: Amazon Bedrock ($0.07–$75 per million tokens), Anyscale ($0.15–$5), Baidu ERNIE API ($0.10–$10), and 22 more. 10 of the 25 offer free tiers for small teams or limited use.

All LLM API Providers

Amazon Bedrock

$0.07–$75 per million tokens
Plans: On-Demand (pay per token): not listed; Provisioned Throughput: not listed; Enterprise (Bedrock + AWS deal): Custom

Anyscale

$0.15–$5 per million tokens
Plans: Anyscale Endpoints: not listed; Managed Ray Clusters: Custom

Baidu ERNIE API

$0.10–$10 per million tokens
Plans: Pay-as-you-go (ERNIE 4.5, 4 Turbo, X1): not listed; Enterprise: Custom

Cerebras Inference API

$0.10–$6 per million tokens
Plans: Free tier (Developer): Free; Pay-as-you-go: not listed; Enterprise: Custom

Claude API

$0.03–$75 per million tokens
Plans: API (Pay-as-you-go): Custom; Enterprise: Custom

Cloudflare Workers AI

$0.05–$5 per million tokens
Plans: Free tier: Free; Pay-as-you-go (Neurons / tokens): not listed; Enterprise: Custom

Cohere API

$0.04–$10 per million tokens
Plans: Trial (Free): Free; Command R (Pay-as-you-go): Custom; Command R+ / Command A (Pay-as-you-go): Custom; +1 more

DeepInfra

$0.00–$82.50 per million tokens
Plans: Pay-as-you-go: Custom

Fireworks AI

Free–$11 per million tokens or per hour
Plans: Serverless: Custom; On-Demand (H100/H200): Custom; On-Demand (B200): Custom; +2 more

Google Gemini API

Free–$18 per million tokens
Plans: Free: Free; Flash-Lite (Paid): Custom; Flash (Paid): Custom; +1 more

Groq

Free (no paid range listed)
Plans: Free: Free; Developer: Custom; Enterprise: Custom

Lepton AI

$0.07–$4 per million tokens
Plans: Serverless Inference: not listed; GPU Cloud: not listed

MiniMax API

$0.20–$3 per million tokens
Plans: Pay-as-you-go (MiniMax M1, M2, abab6.5): not listed; Enterprise: Custom

Mistral AI API

$0.10–$6 per million tokens
Plans: Free: Free; Mistral Small: Custom; Mistral Medium: Custom; +1 more

Moonshot Kimi API

$0.15–$10 per million tokens
Plans: Pay-as-you-go (Kimi K2, Moonshot v1 family): not listed; Enterprise: Custom

NVIDIA NIM

$0.10–$10 per million tokens
Plans: Developer (Free credits): Free; Pay-as-you-go (hosted NIM endpoints): not listed; Enterprise (AI Enterprise license + DGX Cloud): Custom

OctoAI

Custom pricing (service discontinued)

OpenAI API

$0.20–$270 per million tokens
Plans: GPT-5.4 mini / nano (Economy): Custom; GPT-5.5 / GPT-5.4 / Pro (Flagship): Custom; Enterprise: Custom

OpenRouter

Free–$75 per million tokens
Plans: Free Models: Free; Pay-as-you-go: not listed

Perplexity API

$1–$15 per million tokens, plus a per-request fee
Plans: Sonar: Custom; Sonar Pro: Custom; Sonar Reasoning Pro: Custom; +1 more

Qwen API (Alibaba)

$0.05–$20 per million tokens
Plans: Pay-as-you-go (Qwen3, Qwen2.5, Qwen-VL): not listed; Enterprise: Custom

SambaNova Cloud

$0.10–$5 per million tokens
Plans: Free tier: Free; Developer (Pay-as-you-go): not listed; Enterprise: Custom

Together AI

$0.03–$9.95 per million tokens or per hour
Plans: Serverless: Custom; Dedicated (1x H100): Custom; Dedicated (1x H200): Custom; +2 more

Vercel AI SDK

Free–$20 per month (Vercel plan)
Plans: Hobby (Free): Free; Pro: $20; Enterprise: Custom

xAI Grok API

$0.20–$30 per million tokens
Plans: Pay-as-you-go (Grok 4, Grok 4 fast, Grok Code Fast): not listed; Enterprise: Custom

LLM API Providers Pricing FAQ

01 What are LLM API providers?

LLM API providers offer access to large language models via API, enabling developers to add AI capabilities to applications without hosting models themselves. They charge per token (input and output) and compete on price, speed, model selection, and features like web search grounding or RAG optimization.

02 How much do LLM APIs cost in 2026?

LLM API pricing is per-token and varies widely by model size and provider. Small models (under 8B parameters) cost $0.02–$0.20 per million tokens on Groq, Together AI, and Mistral. Mid-range models (Gemini Flash, Mistral Medium) cost $0.30–$1.25 per million tokens. Frontier models (GPT-4o, Claude Sonnet, Gemini Pro) cost $1–$5 per million input tokens. Perplexity Sonar adds per-request fees on top of token costs.

03 Which LLM API provider is cheapest?

For small open-source models, Groq (from $0.05/M tokens) and Mistral Nemo ($0.02/M) are the cheapest. Among larger models, Mistral Large 3 at $0.50/$1.50 per million input/output tokens is far cheaper than GPT-4o or Claude Sonnet. Google Gemini Flash-Lite at $0.10/$0.40 per million input/output tokens is a budget option from a frontier lab. Cohere Command R7B is cheapest for RAG at $0.037/M input tokens.
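
As a rough sketch of how these input/output rates turn into a bill, the Python below prices one workload against the two rate pairs quoted above. The workload size (2 million input tokens, 500,000 output tokens) and the names TokenRates and token_cost are assumptions made up for illustration, not part of any provider's SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """USD per million tokens, split by direction."""
    input_per_million: float
    output_per_million: float


def token_cost(rates: TokenRates, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload billed purely per token."""
    total = (input_tokens * rates.input_per_million
             + output_tokens * rates.output_per_million)
    return total / 1_000_000


# Rates as quoted in the answer above; check each provider's pricing page
# before relying on them.
RATES = {
    "Mistral Large 3": TokenRates(0.50, 1.50),
    "Gemini Flash-Lite": TokenRates(0.10, 0.40),
}

# Hypothetical monthly workload, chosen only for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

for name, rates in RATES.items():
    cost = token_cost(rates, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{name}: ${cost:.2f}")
```

Because output tokens are usually priced higher than input tokens, the ranking between two providers can change with the input/output mix, so it is worth running the numbers on your own traffic shape.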

04 Which LLM API provider has the best free tier?

Google Gemini API offers the best free tier: 1,500 requests/day on Flash models through Google AI Studio with no credit card required. Groq offers a free API key with rate-limited access to all models. Mistral offers a free trial tier via La Plateforme. Cohere offers a free Trial API key for non-commercial use. Together AI and Fireworks AI offer $1 in free credits. Perplexity API has no free tier.

05 What is the difference between per-token and per-request LLM API pricing?

Most LLM APIs charge per token: input tokens × input rate, plus output tokens × output rate. Perplexity API is unusual in adding a per-request fee on top of token costs. Every Sonar query incurs a charge of $5–$14 per 1,000 requests, depending on search context depth. This dual model reflects the cost of real-time web search bundled into each query. When comparing Perplexity to other APIs, you must add both token costs and request fees to get the true cost per query.
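
A minimal sketch of that combined calculation follows. The function name per_query_cost, the token counts, and the $1 per million token rates are illustrative assumptions; only the $5 per 1,000 requests figure comes from the lower bound quoted above.

```python
def per_query_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float,
    output_rate_per_million: float,
    request_fee_per_thousand: float = 0.0,
) -> dict:
    """Split the USD cost of one query into token and request components."""
    token_part = (input_tokens * input_rate_per_million
                  + output_tokens * output_rate_per_million) / 1_000_000
    request_part = request_fee_per_thousand / 1_000
    return {
        "tokens": token_part,
        "request": request_part,
        "total": token_part + request_part,
    }


# Hypothetical query: 500 input tokens, 300 output tokens, $1 per million
# tokens in each direction (assumed), $5 per 1,000 requests (lower bound
# quoted in the text).
cost = per_query_cost(500, 300, 1.0, 1.0, request_fee_per_thousand=5.0)
print(f"tokens:  ${cost['tokens']:.4f}")
print(f"request: ${cost['request']:.4f}")
print(f"total:   ${cost['total']:.4f}")
```

With short queries like this one, the request fee is several times larger than the token charge, which is why a comparison on token rates alone understates the cost.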