Quick Answer
Last verified: May 5, 2026
Medium confidence

Baseten uses custom pricing as of May 2026, with 3 plans available: Basic (free), Pro, and Enterprise. Pro and Enterprise pricing is available on request; contact Baseten directly for a personalized quote. Final pricing depends on your chosen tier, contract length, and negotiated discounts.


  • Free tier: Yes (starter credits; ongoing usage is pay-as-you-go)

Baseten offers 3 pricing tiers: Basic, Pro, and Enterprise. The Pro plan is designed for teams with predictable high-volume inference needing reserved capacity.

Compared to other AI model hosting & inference software, Baseten is positioned at the budget-friendly end of the price range.

  • 1 documented hidden cost beyond list price

How much does Baseten cost?

Baseten uses custom pricing across 3 plans. Contact Baseten directly for a personalized quote. Plans include Basic (free), Pro (custom pricing), and Enterprise (custom pricing).

Baseten Pricing Overview

Baseten uses custom pricing; contact their sales team for a quote. The Basic plan is free and is best for teams getting started with model serving or running variable workloads. The Pro plan requires contacting sales for a custom quote and is designed for teams with predictable high-volume inference needing reserved capacity. The Enterprise plan, also custom-quoted, is designed for enterprises requiring data residency, custom SLAs, or on-prem deployments.

There is at least one documented hidden cost beyond Baseten's list price, in areas such as implementation, training, and add-on fees.

This pricing was last verified on May 5, 2026 against 2 independent sources.

Baseten is a model inference platform offering a free Basic plan with starter credits, plus custom-priced Pro and Enterprise tiers for production workloads. Token-based API pricing varies by model — across 6 models tracked by Artificial Analysis, median rates are $0.60 per million input tokens and $2.20 per million output tokens as of April 2026. Large frontier model deployments requiring dedicated GPU infrastructure are Enterprise-only and require a custom sales quote.
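As a quick sanity check on those median rates, per-request cost is simply token count divided by one million, times the per-million rate. A minimal sketch in Python (rates taken from the figures above; the request sizes and helper name are hypothetical):

```python
# Median Baseten rates cited above (Artificial Analysis, April 2026).
MEDIAN_INPUT_PER_M = 0.60   # USD per 1M input tokens
MEDIAN_OUTPUT_PER_M = 2.20  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one API request at the median rates."""
    return (input_tokens / 1_000_000) * MEDIAN_INPUT_PER_M \
         + (output_tokens / 1_000_000) * MEDIAN_OUTPUT_PER_M

# Example: a 2,000-token prompt with a 500-token reply.
print(f"${request_cost(2_000, 500):.4f}")  # $0.0023
```

At these medians, a full million tokens in and out comes to $2.80, which matches the blended figures reported later in this page.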

How Baseten Pricing Compares

Compare Baseten pricing against top alternatives in AI Model Hosting & Inference.

All Baseten Plans & Pricing

Plan         Monthly         Annual          Best For
Basic        Free            Custom          Teams getting started with model serving or running variable workloads
Pro          Contact Sales   Contact Sales   Teams with predictable high-volume inference needing reserved capacity
Enterprise   Contact Sales   Contact Sales   Enterprises requiring data residency, custom SLAs, or on-prem deployments

Notes: Basic offers all standard GPU types with per-minute, pay-as-you-go billing; Pro uses volume-based custom rates; Enterprise has a reported minimum commitment of ~$5,000/month.
View all features by plan

Basic

  • Pay-as-you-go GPU compute
  • T4 GPU from $0.63/hour
  • L4 GPU from $0.85/hour
  • A10G GPU from $1.21/hour
  • A100 80GB from $4.00/hour
  • H100 80GB from $6.50/hour
  • B200 180GB from $9.98/hour
  • Model API (per-million-token billing)
  • Dedicated deployments and training
  • SOC 2 Type II and HIPAA compliant
  • Email and in-app chat support
  • Fast cold starts
  • Autoscaling to zero

Pro

  • Everything in Basic
  • Priority access to high-demand GPUs
  • Dedicated compute reservations
  • Higher Model API rate limits
  • Hands-on engineering expertise
  • Dedicated support on Slack and Zoom
  • Volume discounts on compute

Enterprise

  • Everything in Pro
  • Custom SLAs
  • Self-host (VPC/on-prem) deployments
  • On-demand flex compute
  • Use existing cloud commitments (AWS/GCP credits)
  • Full data residency control
  • Advanced security and compliance
  • Custom global regions
  • Advanced RBAC with Teams

Usage-Based Rates

Per-unit pricing for Baseten API usage.

Basic

Model              Input    Output   Cached   Per
deepseek-v4        $1.74    $3.48    $0.145   1M tokens
deepseek-v3-1      $0.500   $1.50    $0.250   1M tokens
kimi-k2-6          $1.00    $3.90    $0.200   1M tokens
kimi-k2-5          $0.600   $3.00    $0.120   1M tokens
nemotron-3-super   $0.300   $0.750   $0.060   1M tokens
minimax-m2-5       $0.300   $1.20    $0.060   1M tokens
glm-5              $0.950   $3.15    $0.200   1M tokens
glm-4-7            $0.600   $2.20    $0.120   1M tokens
gpt-oss-120b       $0.100   $0.500   n/a      1M tokens
Model / SKU     Unit   Price
t4-16gb         hour   $0.630
l4-24gb         hour   $0.850
a10g-24gb       hour   $1.21
a100-80gb       hour   $4.00
h100-mig-40gb   hour   $3.75
h100-80gb       hour   $6.50
b200-180gb      hour   $9.98
cpu-1x2         hour   $0.035
cpu-1x4         hour   $0.052
cpu-2x8         hour   $0.104
cpu-4x16        hour   $0.208
cpu-8x32        hour   $0.415
cpu-16x64       hour   $0.829
  • GPU and CPU instances billed per minute (fractions rounded up)
  • Model API priced per 1M tokens with cached input discount
  • No idle charges when deployment scales to zero
  • Volume discounts available on dedicated deployments and training
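The billing rules in the notes above can be combined into a small cost estimator. A sketch in Python (rates from the table above; the rounding and scale-to-zero behavior follow the stated notes, and the function name is my own):

```python
import math

# Hourly list rates (USD) from the table above.
HOURLY_RATES = {
    "t4-16gb": 0.63,
    "a100-80gb": 4.00,
    "h100-80gb": 6.50,
}

def gpu_cost(sku: str, seconds_running: float) -> float:
    """Estimate cost for a dedicated deployment: billed per minute,
    fractional minutes rounded up, zero cost when scaled to zero."""
    if seconds_running <= 0:
        return 0.0  # autoscaled to zero: no idle charges
    minutes = math.ceil(seconds_running / 60)
    return minutes * HOURLY_RATES[sku] / 60

# 90 seconds on an H100 bills as 2 minutes:
print(round(gpu_cost("h100-80gb", 90), 4))  # 0.2167
```

Because fractional minutes round up, many short bursts cost slightly more than one continuous run of the same total duration.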

Compare Baseten vs Alternatives

Before committing to Baseten, compare pricing with these 3 alternatives in the same category.

All Baseten alternatives & migration guides

What Companies Actually Pay for Baseten

Median per-1M-token pricing across 6 models:
Input: $0.600/1M
Output: $2.20/1M

Flagship models in this provider's catalog:
Model                           Input /1M   Output /1M   Blended /1M
baseten_glm-5                   $0.950      $3.15        $1.50
baseten_glm-5-non-reasoning     $0.950      $3.15        $1.50
baseten_glm-4-7                 $0.600      $2.20        $1.00
baseten_glm-4-7-non-reasoning   $0.600      $2.20        $1.00
baseten_deepseek-v3-1_fp8       $0.500      $1.50        $0.750
Top pricing complaints:
  • Large model pricing requires contacting sales, with no transparent rates published
  • No fixed monthly pricing: all compute costs are metered and variable
Source: Artificial Analysis — medians aggregated from 6 models in this provider's catalog. Per-1M-token pricing reflects list rates.
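The blended figures above are consistent with a 3:1 input-to-output token mix, a weighting Artificial Analysis commonly uses; a quick check in Python (the ratio is an assumption inferred from the table, not stated on this page):

```python
def blended_rate(input_per_m: float, output_per_m: float, ratio: float = 3.0) -> float:
    """Blended $/1M tokens, assuming `ratio` input tokens per output token."""
    return (ratio * input_per_m + output_per_m) / (ratio + 1)

# Reproduces the table rows above:
print(round(blended_rate(0.95, 3.15), 2))  # 1.5  (glm-5)
print(round(blended_rate(0.60, 2.20), 2))  # 1.0  (glm-4-7)
print(round(blended_rate(0.50, 1.50), 2))  # 0.75 (deepseek-v3-1_fp8)
```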

Baseten Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Enterprise Frontier Model Hosting (H200-scale)
Year 1 total: Estimated $100,000–$500,000+/year (community estimate; custom quote required)

Hosting a large frontier model such as DeepSeek R1 that requires multiple H200 GPUs via Baseten's Enterprise plan. Requires a custom sales quote; no public pricing available.

Reddit community estimate (r/Clojurescript, 2025-01-22)

How Baseten Pricing Compares

Software     Starting Price   Top Price
Baseten      Custom           Custom
Banana.dev   Custom           Custom
BentoML      Free             $5,000/month
Cerebrium    Free             $100/month

1 Baseten Hidden Cost Beyond the List Price

Beyond the listed price, Baseten has at least one documented hidden cost that can significantly increase total cost of ownership.

Watch for 1 hidden cost
  • GPU Infrastructure Costs for Large-Scale Model Deployments ($100,000–$500,000)
    Severity: critical · 1 source
    Reddit: "since it requires many H200's, I'm guessing the cost is in the multiple hundreds of thousands per year"
    Reddit: "the pricing on their website (https://www.baseten.co/library/deepseek-r1) just has a "call sales" button, which is never a good sign"
Tip

Ask your Baseten sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.

Full hidden costs breakdown →

Intelligence sourced from 1 independent source
Reddit: user discussions
Key claims include inline source attribution. 4 source citations total.

How to Negotiate Baseten Pricing

Baseten contracts are negotiable. This tactic is sourced from real buyer experiences and procurement specialists.

Negotiation Playbook (1 tactic)
Engage Sales Early for Large Model Deployments (medium success rate)

For frontier-scale models requiring dedicated GPU infrastructure (e.g. DeepSeek R1, large parameter models needing multiple H200s), Baseten publishes no pricing — only a 'contact sales' CTA. Reach out early in your evaluation to get a custom quote and negotiate volume commitments in exchange for cost guarantees or reserved capacity.

Reddit (r/Clojurescript, 2025-01-22)

Full negotiation guide →

Baseten Pricing FAQ

01 How much does Baseten cost?

Baseten uses pay-as-you-go GPU pricing billed per minute. T4 GPUs start at $0.63/hour, A10G at $1.21/hour, A100 (80GB) at $4.00/hour, H100 at $6.50/hour, and B200 at $9.98/hour. The Basic plan has no monthly minimum. Pro and Enterprise offer volume discounts.

02 Does Baseten have a free tier?

New Baseten accounts receive starter credits to explore deployments at no initial cost. There is no permanently free tier — ongoing usage is pay-as-you-go or under a Pro/Enterprise contract.

03 How does Baseten billing work?

Baseten bills per minute for dedicated GPU deployments, meaning you only pay when your model is running. Model API usage (for supported open-source models) is billed per million tokens processed. There are no idle charges when deployments are scaled to zero.

04 What GPUs does Baseten support?

Baseten supports T4, L4, A10G, A100 (80GB), H100 MIG (40GB), H100 (80GB), and B200 (180GB) GPUs. GPU availability varies by plan tier, with H100 and B200 accessible on all plans at published rates.

05 Does Baseten offer a fixed monthly pricing plan?

No. Baseten operates on a pay-per-use model — there is no fixed monthly cap. The Basic plan provides starter credits at no initial cost, while Pro and Enterprise are custom-priced based on usage and infrastructure requirements. All compute is metered.

06 How much does it cost to host large models like DeepSeek R1 on Baseten?

Large frontier models requiring multiple H200 GPUs are priced via custom Enterprise agreements only — no public rates are listed. Community estimates suggest such deployments can cost hundreds of thousands of dollars per year. Contact Baseten sales for a formal quote.

07 What is Baseten's typical per-token pricing?

Based on Artificial Analysis data (April 2026), Baseten's median pricing across 6 tracked models is $0.60 per million input tokens and $2.20 per million output tokens. Individual model prices range from $0.10/1M input (gpt-oss-120b-low) to $0.95/1M input (GLM-5 at $3.15/1M output).

Is this pricing incorrect? Let us know and we'll verify and update it.