AI API Pricing Comparison 2026 — OpenAI vs Claude vs Gemini Costs

What Are Tokens?

Think of tokens as small chunks of text. A token is roughly ¾ of a word — so 1,000 tokens is about 750 words (roughly a page and a half of text).

When you send a message to an AI API, you pay for two things:

Input tokens — the text you send (your prompt, context, documents)
Output tokens — the text the AI writes back

Output tokens cost more because the AI is generating new text, which takes more compute. Pricing is listed per 1 million tokens (1M).

A tweet~40 tokens

An email~200 tokens

A blog post~1,500 tokens

A novel~100,000 tokens

Pricing

Cost Per 1M Tokens

All prices in USD. Updated March 2026.

OpenAI

Model	Input	Output	Context	Best For
GPT-4.1	$2.00	$8.00	1M	Coding, long documents, instruction-following
GPT-4o	$2.50	$10.00	128K	General-purpose flagship, multimodal (text + image + audio)
GPT-4.1-mini	$0.40	$1.60	1M	Budget-friendly coding & analysis with long context
GPT-4o-mini	$0.15	$0.60	128K	High-volume lightweight tasks, classification
GPT-4.1-nano	$0.10	$0.40	1M	Ultra-cheap extraction, routing, high-volume ops
o3	$2.00	$8.00	200K	Advanced reasoning — math, science, multi-step logic
o4-mini	$1.10	$4.40	200K	Budget reasoning for STEM and chain-of-thought tasks

Anthropic (Claude)

Model	Input	Output	Context	Best For
Claude Opus 4.6	$5.00	$25.00	200K	Most capable — complex analysis, agentic coding, deep research
Claude Sonnet 4.6	$3.00	$15.00	200K	Best balance of quality and cost — coding, tool use, production workloads
Claude Haiku 4.5	$1.00	$5.00	200K	Fast & affordable — real-time chat, classification, simple tasks
Claude Haiku 3.5 (legacy)	$0.80	$4.00	200K	Budget option — still available, slightly cheaper than Haiku 4.5

Google (Gemini)

Model	Input	Output	Context	Best For
Gemini 2.5 Pro	$1.25	$10.00	1M	Advanced reasoning at a great price — coding, multimodal analysis
Gemini 2.5 Flash	$0.30	$2.50	1M	Fast reasoning on a budget — great value mid-tier
Gemini 2.0 Flash	$0.10	$0.40	1M	Cheapest 1M-context model — high-volume, real-time apps

Recommendations

Which Model Should You Use?

Chatbot / Customer Support

High volume, fast responses, low cost per interaction.

Best: Gemini 2.0 FlashRunner-up: GPT-4o-mini

At $0.10/1M input tokens, Gemini 2.0 Flash handles thousands of conversations for pennies. GPT-4o-mini is comparable. Both are fast and cheap.

Code Generation & Dev Tools

Needs strong reasoning and code quality.

Best: Claude Sonnet 4Runner-up: GPT-4.1

Claude Sonnet consistently ranks highest on coding benchmarks. GPT-4.1 is a close second with a 1M context window — great for large codebases.

Document Processing & Summarization

Large inputs, moderate outputs — context window matters.

Best: Gemini 2.5 FlashRunner-up: GPT-4.1-mini

Both offer 1M context windows at low prices. Gemini 2.5 Flash at $0.30/1M input is hard to beat for ingesting long documents.

Complex Reasoning & Research

Multi-step logic, analysis, math, science.

Best: o3Runner-up: Gemini 2.5 Pro

OpenAI's o3 excels at chain-of-thought reasoning. Gemini 2.5 Pro offers similar capability at a slightly lower price point.

High-Volume Classification & Extraction

Thousands of API calls — cost is everything.

Best: GPT-4.1-nanoRunner-up: Gemini 2.0 Flash

Both cost $0.10/1M input tokens — the cheapest in the industry. GPT-4.1-nano has a 1M context window for batch processing.

Creative Writing & Content

Natural, nuanced prose with good instruction following.

Best: Claude Sonnet 4Runner-up: GPT-4o

Claude is known for natural, well-structured writing. GPT-4o is multimodal and handles mixed-media content well.

Calculator

Estimate Your API Costs

Plug in your expected usage and see what each model will cost you per day and per month.

Input tokens per request~75,000 words

Output tokens per request~15,000 words

Requests per day3,000 / month

ModelDaily CostMonthly Cost

OpenAIGPT-4.1-nano$1.80$54.00

GoogleGemini 2.0 Flash$1.80$54.00

OpenAIGPT-4o-mini$2.70$81.00

OpenAIGPT-4.1-mini$7.20$216.00

GoogleGemini 2.5 Flash$8.00$240.00

AnthropicClaude Haiku 3.5$16.00$480.00

OpenAIo4-mini$19.80$594.00

AnthropicClaude Haiku 4.5$20.00$600.00

GoogleGemini 2.5 Pro$32.50$975.00

OpenAIGPT-4.1$36.00$1080.00

OpenAIo3$36.00$1080.00

OpenAIGPT-4o$45.00$1350.00

AnthropicClaude Sonnet 4.6$60.00$1800.00

AnthropicClaude Opus 4.6$100.00$3000.00

Key Takeaways

Cheapest overall: Gemini 2.0 Flash and GPT-4.1-nano tie at $0.10 / 1M input tokens. Perfect for high-volume, simple tasks.
Best value mid-tier: Gemini 2.5 Flash ($0.30 input / $2.50 output) gives you reasoning capabilities at a great price.
Best for coding: Claude Sonnet 4.6 leads on code benchmarks. GPT-4.1's 1M context is ideal for large projects.
Most capable: Claude Opus 4.6 at $5 / $25 — the most capable model available, now 3x cheaper than its predecessor.
Context window matters: If you need to process long documents, the GPT-4.1 family and Gemini models offer 1M tokens natively.
Output costs add up: Output tokens cost 3-5x more than input. If your app generates long responses, that's where most of your spend goes.

Need help choosing the right AI model?

We engineer AI-first applications and can help you pick the right model — and the right architecture — for your use case.

Start a Project

AI API PricingCompared