Vidu Q3 Pro đã ra mắt — dùng thử ngay

Models

New5% off
anthropicanthropic/claude-opus-4.7
Input$5.0/Mt$4.8/Mt
Output$25.0/Mt$23.8/Mt
Context1,000,000
LLMVisionTool Use
New5% off
anthropicanthropic/claude-opus-4.6
Input$5.0/Mt$4.8/Mt
Output$25.0/Mt$23.8/Mt
Context1,000,000
LLMVisionTool Use
New5% off
anthropicanthropic/claude-sonnet-4.6
Input$3.0/Mt$2.8/Mt
Output$15.0/Mt$14.3/Mt
Context1,000,000
LLMVisionTool Use
New5% off
anthropicanthropic/claude-opus-4.5
Input$5.0/Mt$4.8/Mt
Output$25.0/Mt$23.8/Mt
Context200,000
LLMVisionTool Use
New5% off
anthropicanthropic/claude-sonnet-4.5
Input$3.0/Mt$2.8/Mt
Output$15.0/Mt$14.3/Mt
Context1,000,000
LLMVisionTool Use
New5% off
anthropicanthropic/claude-haiku-4.5
Input$1.0/Mt$0.95/Mt
Output$5.0/Mt$4.8/Mt
Context200,000
LLMVisionTool Use
New
anthropicanthropic/claude-sonnet-4
Input$3.0/Mt
Output$15.0/Mt
Context200,000
LLMVisionTool Use
New
anthropicanthropic/claude-3.5-haiku
Input$0.80/Mt
Output$4.0/Mt
Context200,000
LLMVisionTool Use
NewTiered pricing
openaiopenai/gpt-5.5
Input$5.0/Mt
Output$30.0/Mt
Context1,050,000
LLMVisionTool Use
NewTiered pricing
openaiopenai/gpt-5.4-pro
Input$30.0/Mt
Output$180.0/Mt
Context1,050,000
LLMVisionTool Use
NewTiered pricing
openaiopenai/gpt-5.4
Input$2.5/Mt
Output$15.0/Mt
Context1,050,000
LLMVisionTool Use
New
openaiopenai/gpt-5.3-chat
Input$1.8/Mt
Output$14.0/Mt
Context128,000
LLMVisionTool Use
New
openaiopenai/gpt-5.4-nano
Input$0.20/Mt
Output$1.3/Mt
Context400,000
LLMVisionTool Use
New
openaiopenai/gpt-5.4-mini
Input$0.75/Mt
Output$4.5/Mt
Context400,000
LLMVisionTool Use
New
openaiopenai/gpt-5.1
Input$1.3/Mt
Output$10.0/Mt
Context400,000
LLMVisionTool Use
NewTiered pricing
googlegoogle/gemini-3.1-pro-preview
Input$2.0/Mt
Output$12.0/Mt
Context1,048,576
LLMVisionAudio InTool Use
New
googlegoogle/gemini-3.1-flash-lite-preview
Input$0.25/Mt
Output$1.5/Mt
Context1,048,576
LLMVisionAudio InTool Use
New
googlegoogle/gemini-3-pro-image-preview
Input$2.0/Mt
Output$12.0/Mt
Context65,536
LLMVisionImage Out
New
googlegoogle/gemini-2.5-flash
Input$0.30/Mt
Output$2.5/Mt
Context1,048,576
LLMVisionAudio InTool Use
NewTiered pricing
googlegoogle/gemini-2.5-pro
Input$1.3/Mt
Output$10.0/Mt
Context1,048,576
LLMVisionAudio InTool Use
New
chatglmz-ai/glm-5.1
Input$1.4/Mt
Output$4.4/Mt
Context202,752
LLMTool Use
New
deepseekdeepseek/deepseek-v4-flash
Input$0.17/Mt
Output$0.34/Mt
Context1,048,576
LLMTool Use
New
deepseekdeepseek/deepseek-v4-pro
Input$1.8/Mt
Output$3.7/Mt
Context1,048,576
LLMTool Use
New
deepseekdeepseek/deepseek-v3.2
Input$0.26/Mt
Output$0.38/Mt
Context163,840
LLMTool Use
New
minimaxminimax/minimax-m2.7
Input$0.30/Mt
Output$1.2/Mt
Context204,800
LLMTool Use
New
moonshotmoonshotai/kimi-k2.5
Input$0.60/Mt
Output$3.0/Mt
Context262,144
LLMVisionTool Use

LLM API — Access 290+ AI Models

Compare pricing, speed, and performance for GPT-5.5, Claude Opus 4.7, Gemini 3, Qwen 3, DeepSeek R1, Llama 4, Grok 4, and more. Unified OpenAI-compatible API with no cold starts and transparent per-token pricing.

Why Choose WaveSpeedAI for LLMs

290+ Models

GPT, Claude, Gemini, Qwen, DeepSeek, Llama, Grok, Mistral — all in one unified API.

OpenAI Compatible

Drop-in replacement for OpenAI SDK. Switch models with one line of code.

No Cold Starts

Models are always warm. First-token latency measured in milliseconds.

Pay Per Token

Transparent pricing with no subscriptions. Only pay for what you use.

Frequently Asked Questions

How does pricing work?+

You pay per token — input and output tokens are priced separately. No subscriptions, no minimum commitments. Check the pricing table above for per-model rates.

Is the API compatible with OpenAI?+

Yes. Our API is fully OpenAI-compatible. Use the OpenAI SDK and just change the base URL and API key.

What models are available?+

We offer 290+ models from 30+ providers including OpenAI GPT-5.5 & GPT-5.4, Anthropic Claude Opus 4.7, Google Gemini 3, Qwen 3, DeepSeek R1 & V3, Meta Llama 4, xAI Grok 4, Mistral, and many more.

Are there rate limits?+

Rate limits depend on your plan. Free tier includes generous limits for testing. Paid plans offer higher throughput.