Quick Answer
Last verified: May 5, 2026
Medium confidence

Baseten uses custom pricing as of May 2026, with 3 plans available: Basic (free), Pro, and Enterprise. Pro and Enterprise pricing is available on request; contact Baseten directly for a personalized quote. Final pricing depends on your chosen tier, contract length, and negotiated discounts.


  • Free tier: Yes (starter credits; ongoing usage is pay-as-you-go)

Baseten offers 3 pricing tiers: Basic, Pro, and Enterprise. The Pro plan is designed for teams with predictable high-volume inference needing reserved capacity.

Compared to other AI model hosting & inference software, Baseten is positioned at the budget-friendly end of the price range.

  • 1 documented hidden cost beyond list price

How much does Baseten cost?

Baseten uses custom pricing across 3 plans. Contact Baseten directly for a personalized quote. Plans include Basic (free), Pro (custom pricing), and Enterprise (custom pricing).

Baseten Pricing Overview

Baseten uses custom pricing; contact their sales team for a quote. The Basic plan is free and is best for teams getting started with model serving or running variable workloads. The Pro plan requires contacting sales for a custom quote and is designed for teams with predictable high-volume inference needing reserved capacity. The Enterprise plan, also custom-quoted, is designed for enterprises requiring data residency, custom SLAs, or on-prem deployments.

There is at least one documented hidden cost beyond Baseten's list price, in areas such as implementation, training, and add-on fees.

This pricing was last verified on May 5, 2026 against 2 independent sources.

Baseten is a model inference platform offering a free Basic plan with starter credits, plus custom-priced Pro and Enterprise tiers for production workloads. Token-based API pricing varies by model — across 6 models tracked by Artificial Analysis, median rates are $0.60 per million input tokens and $2.20 per million output tokens as of April 2026. Large frontier model deployments requiring dedicated GPU infrastructure are Enterprise-only and require a custom sales quote.
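As a quick sanity check on those median rates, per-request cost is simply token count divided by one million, times the per-million rate. A minimal sketch in Python (rates taken from the figures above; the request sizes and helper name are hypothetical):

```python
# Median Baseten rates cited above (Artificial Analysis, April 2026).
MEDIAN_INPUT_PER_M = 0.60   # USD per 1M input tokens
MEDIAN_OUTPUT_PER_M = 2.20  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one API request at the median rates."""
    return (input_tokens / 1_000_000) * MEDIAN_INPUT_PER_M \
         + (output_tokens / 1_000_000) * MEDIAN_OUTPUT_PER_M

# Example: a 2,000-token prompt with a 500-token reply.
print(f"${request_cost(2_000, 500):.4f}")  # $0.0023
```

At these medians, a full million tokens in and out comes to $2.80, which matches the blended figures reported later in this page.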

How Baseten Pricing Compares

Compare Baseten pricing against top alternatives in AI Model Hosting & Inference.

All Baseten Plans & Pricing

Plan         Monthly         Annual          Best For
Basic        Free            Custom          Teams getting started with model serving or running variable workloads
Pro          Contact Sales   Contact Sales   Teams with predictable high-volume inference needing reserved capacity
Enterprise   Contact Sales   Contact Sales   Enterprises requiring data residency, custom SLAs, or on-prem deployments

Notes: Basic offers all standard GPU types with per-minute, pay-as-you-go billing; Pro uses volume-based custom rates; Enterprise has a reported minimum commitment of ~$5,000/month.
View all features by plan

Basic

  • Pay-as-you-go GPU compute
  • T4 GPU from $0.63/hour
  • L4 GPU from $0.85/hour
  • A10G GPU from $1.21/hour
  • A100 80GB from $4.00/hour
  • H100 80GB from $6.50/hour
  • B200 180GB from $9.98/hour
  • Model API (per-million-token billing)
  • Dedicated deployments and training
  • SOC 2 Type II and HIPAA compliant
  • Email and in-app chat support
  • Fast cold starts
  • Autoscaling to zero

Pro

  • Everything in Basic
  • Priority access to high-demand GPUs
  • Dedicated compute reservations
  • Higher Model API rate limits
  • Hands-on engineering expertise
  • Dedicated support on Slack and Zoom
  • Volume discounts on compute

Enterprise

  • Everything in Pro
  • Custom SLAs
  • Self-host (VPC/on-prem) deployments
  • On-demand flex compute
  • Use existing cloud commitments (AWS/GCP credits)
  • Full data residency control
  • Advanced security and compliance
  • Custom global regions
  • Advanced RBAC with Teams

Usage-Based Rates

Per-unit pricing for Baseten API usage.

Basic

Model              Input    Output   Cached   Per
deepseek-v4        $1.74    $3.48    $0.145   1M tokens
deepseek-v3-1      $0.500   $1.50    $0.250   1M tokens
kimi-k2-6          $1.00    $3.90    $0.200   1M tokens
kimi-k2-5          $0.600   $3.00    $0.120   1M tokens
nemotron-3-super   $0.300   $0.750   $0.060   1M tokens
minimax-m2-5       $0.300   $1.20    $0.060   1M tokens
glm-5              $0.950   $3.15    $0.200   1M tokens
glm-4-7            $0.600   $2.20    $0.120   1M tokens
gpt-oss-120b       $0.100   $0.500   n/a      1M tokens
Model / SKU     Unit   Price
t4-16gb         hour   $0.630
l4-24gb         hour   $0.850
a10g-24gb       hour   $1.21
a100-80gb       hour   $4.00
h100-mig-40gb   hour   $3.75
h100-80gb       hour   $6.50
b200-180gb      hour   $9.98
cpu-1x2         hour   $0.035
cpu-1x4         hour   $0.052
cpu-2x8         hour   $0.104
cpu-4x16        hour   $0.208
cpu-8x32        hour   $0.415
cpu-16x64       hour   $0.829
  • GPU and CPU instances billed per minute (fractions rounded up)
  • Model API priced per 1M tokens with cached input discount
  • No idle charges when deployment scales to zero
  • Volume discounts available on dedicated deployments and training
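The billing rules in the notes above can be combined into a small cost estimator. A sketch in Python (rates from the table above; the rounding and scale-to-zero behavior follow the stated notes, and the function name is my own):

```python
import math

# Hourly list rates (USD) from the table above.
HOURLY_RATES = {
    "t4-16gb": 0.63,
    "a100-80gb": 4.00,
    "h100-80gb": 6.50,
}

def gpu_cost(sku: str, seconds_running: float) -> float:
    """Estimate cost for a dedicated deployment: billed per minute,
    fractional minutes rounded up, zero cost when scaled to zero."""
    if seconds_running <= 0:
        return 0.0  # autoscaled to zero: no idle charges
    minutes = math.ceil(seconds_running / 60)
    return minutes * HOURLY_RATES[sku] / 60

# 90 seconds on an H100 bills as 2 minutes:
print(round(gpu_cost("h100-80gb", 90), 4))  # 0.2167
```

Because fractional minutes round up, many short bursts cost slightly more than one continuous run of the same total duration.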

Compare Baseten vs Alternatives

Before committing to Baseten, compare pricing with these 3 alternatives in the same category.

All Baseten alternatives & migration guides

What Companies Actually Pay for Baseten

Median per-1M-token pricing across 6 models:
Input: $0.600/1M
Output: $2.20/1M

Flagship models in this provider's catalog:
Model                           Input /1M   Output /1M   Blended /1M
baseten_glm-5                   $0.950      $3.15        $1.50
baseten_glm-5-non-reasoning     $0.950      $3.15        $1.50
baseten_glm-4-7                 $0.600      $2.20        $1.00
baseten_glm-4-7-non-reasoning   $0.600      $2.20        $1.00
baseten_deepseek-v3-1_fp8       $0.500      $1.50        $0.750
Top pricing complaints:
  • Large model pricing requires contacting sales, with no transparent rates published
  • No fixed monthly pricing: all compute costs are metered and variable
Source: Artificial Analysis — medians aggregated from 6 models in this provider's catalog. Per-1M-token pricing reflects list rates.
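The blended figures above are consistent with a 3:1 input-to-output token mix, a weighting Artificial Analysis commonly uses; a quick check in Python (the ratio is an assumption inferred from the table, not stated on this page):

```python
def blended_rate(input_per_m: float, output_per_m: float, ratio: float = 3.0) -> float:
    """Blended $/1M tokens, assuming `ratio` input tokens per output token."""
    return (ratio * input_per_m + output_per_m) / (ratio + 1)

# Reproduces the table rows above:
print(round(blended_rate(0.95, 3.15), 2))  # 1.5  (glm-5)
print(round(blended_rate(0.60, 2.20), 2))  # 1.0  (glm-4-7)
print(round(blended_rate(0.50, 1.50), 2))  # 0.75 (deepseek-v3-1_fp8)
```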

Baseten Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Enterprise Frontier Model Hosting (H200-scale)
Year 1 total: Estimated $100,000–$500,000+/year (community estimate; custom quote required)

Hosting a large frontier model such as DeepSeek R1 that requires multiple H200 GPUs via Baseten's Enterprise plan. Requires a custom sales quote; no public pricing available.

Reddit community estimate (r/Clojurescript, 2025-01-22)

How Baseten Pricing Compares

Software     Starting Price   Top Price
Baseten      Custom           Custom
Banana.dev   Custom           Custom
BentoML      Free             $5,000/month
Cerebrium    Free             $100/month

1 Baseten Hidden Cost Beyond the List Price

Beyond the listed price, Baseten has at least one documented hidden cost that can significantly increase total cost of ownership.

Watch for 1 hidden cost
  • GPU Infrastructure Costs for Large-Scale Model Deployments ($100,000–$500,000)
    Severity: critical · 1 source
    Reddit: "since it requires many H200's, I'm guessing the cost is in the multiple hundreds of thousands per year"
    Reddit: "the pricing on their website (https://www.baseten.co/library/deepseek-r1) just has a "call sales" button, which is never a good sign"
Tip

Ask your Baseten sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.

Full hidden costs breakdown →

Intelligence sourced from 1 independent source
Reddit: user discussions
Key claims include inline source attribution. 4 source citations total.

How to Negotiate Baseten Pricing

Baseten contracts are negotiable. This tactic is sourced from real buyer experiences and procurement specialists.

Negotiation Playbook (1 tactic)
Engage Sales Early for Large Model Deployments (medium success rate)

For frontier-scale models requiring dedicated GPU infrastructure (e.g. DeepSeek R1, large parameter models needing multiple H200s), Baseten publishes no pricing — only a 'contact sales' CTA. Reach out early in your evaluation to get a custom quote and negotiate volume commitments in exchange for cost guarantees or reserved capacity.

Reddit (r/Clojurescript, 2025-01-22)

Full negotiation guide →

Baseten Pricing FAQ

01 How much does Baseten cost?

Baseten uses pay-as-you-go GPU pricing billed per minute. T4 GPUs start at $0.63/hour, A10G at $1.21/hour, A100 (80GB) at $4.00/hour, H100 at $6.50/hour, and B200 at $9.98/hour. The Basic plan has no monthly minimum. Pro and Enterprise offer volume discounts.

02 Does Baseten have a free tier?

New Baseten accounts receive starter credits to explore deployments at no initial cost. There is no permanently free tier — ongoing usage is pay-as-you-go or under a Pro/Enterprise contract.

03 How does Baseten billing work?

Baseten bills per minute for dedicated GPU deployments, meaning you only pay when your model is running. Model API usage (for supported open-source models) is billed per million tokens processed. There are no idle charges when deployments are scaled to zero.

04 What GPUs does Baseten support?

Baseten supports T4, L4, A10G, A100 (80GB), H100 MIG (40GB), H100 (80GB), and B200 (180GB) GPUs. GPU availability varies by plan tier, with H100 and B200 accessible on all plans at published rates.

05 Does Baseten offer a fixed monthly pricing plan?

No. Baseten operates on a pay-per-use model — there is no fixed monthly cap. The Basic plan provides starter credits at no initial cost, while Pro and Enterprise are custom-priced based on usage and infrastructure requirements. All compute is metered.

06 How much does it cost to host large models like DeepSeek R1 on Baseten?

Large frontier models requiring multiple H200 GPUs are priced via custom Enterprise agreements only — no public rates are listed. Community estimates suggest such deployments can cost hundreds of thousands of dollars per year. Contact Baseten sales for a formal quote.

07 What is Baseten's typical per-token pricing?

Based on Artificial Analysis data (April 2026), Baseten's median pricing across 6 tracked models is $0.60 per million input tokens and $2.20 per million output tokens. Individual model prices range from $0.10/1M input (gpt-oss-120b-low) to $0.95/1M input (GLM-5 at $3.15/1M output).

Is this pricing incorrect? Let us know and we'll verify and update it.