Baseten Pricing 2026
Complete pricing guide with plans, hidden costs, and cost analysis
As of May 2026, Baseten offers 3 plans. The Basic plan is free to start, with pay-as-you-go usage at published rates. Pro and Enterprise pricing is custom: contact Baseten sales for a quote. Custom pricing depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: Yes (the Basic plan has no monthly fee; usage beyond starter credits is pay-as-you-go)
Baseten offers 3 pricing tiers: Basic, Pro, and Enterprise. The Pro plan is designed for teams with predictable high-volume inference needing reserved capacity.
Compared with other AI model hosting & inference software, Baseten sits at the budget-friendly end of the price range.
- 1 documented hidden cost beyond list price
How much does Baseten cost?
Baseten Pricing Overview
The Basic plan has no monthly fee and is best for teams getting started with model serving or running variable workloads. The Pro plan requires a custom quote from sales and is designed for teams with predictable high-volume inference needing reserved capacity. The Enterprise plan also requires a custom quote and is designed for enterprises requiring data residency, custom SLAs, or on-prem deployments.
There is at least 1 documented hidden cost beyond Baseten's list price, in categories such as implementation, training, and add-on fees.
This pricing was last verified on May 5, 2026 from 2 independent sources.
Baseten is a model inference platform offering a free Basic plan with starter credits, plus custom-priced Pro and Enterprise tiers for production workloads. Token-based API pricing varies by model — across 6 models tracked by Artificial Analysis, median rates are $0.60 per million input tokens and $2.20 per million output tokens as of April 2026. Large frontier model deployments requiring dedicated GPU infrastructure are Enterprise-only and require a custom sales quote.
All Baseten Plans & Pricing
| Plan | Details | Monthly | Annual | Best For |
|---|---|---|---|---|
| Basic | GPU access: all standard GPU types. Billing: per minute, pay-as-you-go | Free | Custom | Teams getting started with model serving or running variable workloads |
| Pro | Billing: volume-based custom rates | Contact Sales | Contact Sales | Teams with predictable high-volume inference needing reserved capacity |
| Enterprise | Minimum commitment: ~$5,000/month (reported) | Contact Sales | Contact Sales | Enterprises requiring data residency, custom SLAs, or on-prem deployments |
Features by Plan
Basic
- Pay-as-you-go GPU compute
- T4 GPU from $0.63/hour
- L4 GPU from $0.85/hour
- A10G GPU from $1.21/hour
- A100 80GB from $4.00/hour
- H100 80GB from $6.50/hour
- B200 180GB from $9.98/hour
- Model API (per-million-token billing)
- Dedicated deployments and training
- SOC 2 Type II and HIPAA compliant
- Email and in-app chat support
- Fast cold starts
- Autoscaling to zero
Pro
- Everything in Basic
- Priority access to high-demand GPUs
- Dedicated compute reservations
- Higher Model API rate limits
- Hands-on engineering expertise
- Dedicated support on Slack and Zoom
- Volume discounts on compute
Enterprise
- Everything in Pro
- Custom SLAs
- Self-host (VPC/on-prem) deployments
- On-demand flex compute
- Use existing cloud commitments (AWS/GCP credits)
- Full data residency control
- Advanced security and compliance
- Custom global regions
- Advanced RBAC with Teams
Usage-Based Rates
Per-unit pricing for Baseten API usage.
Basic
| Model | Input | Output | Cached | Per |
|---|---|---|---|---|
| deepseek-v4 | $1.74 | $3.48 | $0.145 | 1M tokens |
| deepseek-v3-1 | $0.500 | $1.50 | $0.250 | 1M tokens |
| kimi-k2-6 | $1.00 | $3.90 | $0.200 | 1M tokens |
| kimi-k2-5 | $0.600 | $3.00 | $0.120 | 1M tokens |
| nemotron-3-super | $0.300 | $0.750 | $0.060 | 1M tokens |
| minimax-m2-5 | $0.300 | $1.20 | $0.060 | 1M tokens |
| glm-5 | $0.950 | $3.15 | $0.200 | 1M tokens |
| glm-4-7 | $0.600 | $2.20 | $0.120 | 1M tokens |
| gpt-oss-120b | $0.100 | $0.500 | — | 1M tokens |
Compute instances

| Instance / SKU | Unit | Price |
|---|---|---|
| t4-16gb | hour | $0.630 |
| l4-24gb | hour | $0.850 |
| a10g-24gb | hour | $1.21 |
| a100-80gb | hour | $4.00 |
| h100-mig-40gb | hour | $3.75 |
| h100-80gb | hour | $6.50 |
| b200-180gb | hour | $9.98 |
| cpu-1x2 | hour | $0.035 |
| cpu-1x4 | hour | $0.052 |
| cpu-2x8 | hour | $0.104 |
| cpu-4x16 | hour | $0.208 |
| cpu-8x32 | hour | $0.415 |
| cpu-16x64 | hour | $0.829 |
- GPU and CPU instances billed per minute (fractions rounded up)
- Model API priced per 1M tokens with cached input discount
- No idle charges when deployment scales to zero
- Volume discounts available on dedicated deployments and training
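The per-minute billing rule above can be sketched in a few lines of Python. This is a minimal estimate, not Baseten's billing logic: the function name `estimate_compute_cost` is hypothetical, the rates are copied from the instance table above, and it assumes "fractions rounded up" means a partial minute is billed as a full minute.

```python
import math

# Hourly list prices (USD) copied from the instance table in this guide.
HOURLY_RATES = {
    "t4-16gb": 0.63,
    "l4-24gb": 0.85,
    "a10g-24gb": 1.21,
    "a100-80gb": 4.00,
    "h100-mig-40gb": 3.75,
    "h100-80gb": 6.50,
    "b200-180gb": 9.98,
}


def estimate_compute_cost(sku: str, active_seconds: float) -> float:
    """Estimate the charge for one stretch of active (non-idle) runtime.

    Assumption: a partial minute is billed as a whole minute.
    Time spent scaled to zero is not billed, so it is simply not passed in.
    """
    billed_minutes = math.ceil(active_seconds / 60)
    per_minute_rate = HOURLY_RATES[sku] / 60
    return round(billed_minutes * per_minute_rate, 4)


# 90 seconds of H100 time is billed as 2 minutes under this assumption.
print(estimate_compute_cost("h100-80gb", 90))
```

Summing this estimate over every active window in a month gives a rough pay-as-you-go bill. Volume discounts on Pro and Enterprise are negotiated, so they are not modeled here.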
What Companies Actually Pay for Baseten
| Model | Input /1M | Output /1M | Blended /1M |
|---|---|---|---|
| baseten_glm-5 | $0.950 | $3.15 | $1.50 |
| baseten_glm-5-non-reasoning | $0.950 | $3.15 | $1.50 |
| baseten_glm-4-7 | $0.600 | $2.20 | $1.00 |
| baseten_glm-4-7-non-reasoning | $0.600 | $2.20 | $1.00 |
| baseten_deepseek-v3-1_fp8 | $0.500 | $1.50 | $0.750 |
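The Blended /1M figures in the table above are consistent with a weighted average that counts input tokens three times as heavily as output tokens. That 3:1 ratio is inferred from the numbers in the table, not stated by the source, so treat the sketch below as an assumption; `blended_price` is a hypothetical helper name.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: a 3:1 input-to-output token mix, inferred from the table.
    """
    total_weight = input_weight + output_weight
    weighted_sum = input_weight * input_per_m + output_weight * output_per_m
    return round(weighted_sum / total_weight, 4)


# Reproduce the table rows from their input and output prices.
for name, inp, out in [
    ("baseten_glm-5", 0.95, 3.15),
    ("baseten_glm-4-7", 0.60, 2.20),
    ("baseten_deepseek-v3-1_fp8", 0.50, 1.50),
]:
    print(name, blended_price(inp, out))
```

If your workload is more output-heavy than 3:1, pass different weights to see how the effective rate shifts toward the output price.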
Baseten Year 1 Total Cost by Company Size
Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.
Scenario: hosting a large frontier model such as DeepSeek R1, which requires multiple H200 GPUs, on Baseten's Enterprise plan. This requires a custom sales quote; no public pricing is available.
Source: Reddit community estimate (r/Clojurescript, 2025-01-22)
How Baseten Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Baseten | Custom | Custom |
| Banana.dev | Custom | Custom |
| BentoML | Free | $5,000/month |
| Cerebrium | Free | $100/month |
How to Negotiate Baseten Pricing
Baseten contracts are negotiable. This tactic is sourced from real buyer experiences and procurement specialists.
For frontier-scale models requiring dedicated GPU infrastructure (e.g. DeepSeek R1, large parameter models needing multiple H200s), Baseten publishes no pricing — only a 'contact sales' CTA. Reach out early in your evaluation to get a custom quote and negotiate volume commitments in exchange for cost guarantees or reserved capacity.
Source: Reddit (r/Clojurescript, 2025-01-22)
Baseten Pricing FAQ
01 How much does Baseten cost?
Baseten uses pay-as-you-go GPU pricing billed per minute. T4 GPUs start at $0.63/hour, A10G at $1.21/hour, A100 (80GB) at $4.00/hour, H100 at $6.50/hour, and B200 at $9.98/hour. The Basic plan has no monthly minimum. Pro and Enterprise offer volume discounts.
02 Does Baseten have a free tier?
New Baseten accounts receive starter credits to explore deployments at no initial cost. There is no permanently free tier — ongoing usage is pay-as-you-go or under a Pro/Enterprise contract.
03 How does Baseten billing work?
Baseten bills per minute for dedicated GPU deployments, meaning you only pay when your model is running. Model API usage (for supported open-source models) is billed per million tokens processed. There are no idle charges when deployments are scaled to zero.
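For the per-token side of billing, a rough request-cost estimate might look like the following. The rates are copied from the Model API table in this guide. The function name `estimate_token_cost` is hypothetical, and two assumptions are made: cached input tokens are a subset of total input tokens billed at the cached rate, and models with no listed cached rate bill all input at the standard rate.

```python
# Per-1M-token list prices (USD) copied from the Model API table in this guide.
MODEL_RATES = {
    "glm-4-7": {"input": 0.60, "output": 2.20, "cached": 0.12},
    "deepseek-v3-1": {"input": 0.50, "output": 1.50, "cached": 0.25},
    "gpt-oss-120b": {"input": 0.10, "output": 0.50, "cached": None},
}


def estimate_token_cost(model: str, input_tokens: int, output_tokens: int,
                        cached_input_tokens: int = 0) -> float:
    """Estimate the cost in USD of Model API usage.

    Assumptions: cached tokens are part of input_tokens, and a model
    without a listed cached rate bills cached input at the normal rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    rates = MODEL_RATES[model]
    cached_rate = rates["cached"] if rates["cached"] is not None else rates["input"]
    uncached_tokens = input_tokens - cached_input_tokens
    cost = (
        uncached_tokens * rates["input"]
        + cached_input_tokens * cached_rate
        + output_tokens * rates["output"]
    ) / 1_000_000
    return round(cost, 6)


# 1M input tokens (half served from cache) plus 1M output tokens on glm-4-7.
print(estimate_token_cost("glm-4-7", 1_000_000, 1_000_000, 500_000))
```

Comparing the same call with `cached_input_tokens=0` shows how much the cached-input discount saves on prompt-heavy workloads.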
04 What GPUs does Baseten support?
Baseten supports T4, L4, A10G, A100 (80GB), H100 MIG (40GB), H100 (80GB), and B200 (180GB) GPUs. All of these, including H100 and B200, are available on every plan at published rates; the Pro plan adds priority access to high-demand GPUs.
05 Does Baseten offer a fixed monthly pricing plan?
No. Baseten operates on a pay-per-use model — there is no fixed monthly cap. The Basic plan provides starter credits at no initial cost, while Pro and Enterprise are custom-priced based on usage and infrastructure requirements. All compute is metered.
06 How much does it cost to host large models like DeepSeek R1 on Baseten?
Large frontier models requiring multiple H200 GPUs are priced via custom Enterprise agreements only — no public rates are listed. Community estimates suggest such deployments can cost hundreds of thousands of dollars per year. Contact Baseten sales for a formal quote.
07 What is Baseten's typical per-token pricing?
Based on Artificial Analysis data (April 2026), Baseten's median pricing across 6 tracked models is $0.60 per million input tokens and $2.20 per million output tokens. Input prices for individual tracked models range from $0.10 per 1M tokens (gpt-oss-120b-low) to $0.95 per 1M tokens (GLM-5, which costs $3.15 per 1M output tokens).