OpenAI generation models are now available on WaveSpeedAI, and GPT Image 2.0 is online: smarter, more powerful tools for AI image and video creation.

Catalog

Sora-2 / Image-to-Video — Add motion to a single image with physics-aware dynamics and stable identities.

Sora-2 / Image-to-Video Pro — Higher fidelity and longer, smoother camera language for editorial or production shots.

Sora-2 / Text-to-Video — Generate scenes directly from text prompts; strong temporal consistency.

Sora-2 / Text-to-Video Pro — Pro-grade steerability and long-range coherence for complex sequences.

GPT-Image-1 / Text-to-Image — Fast, prompt-faithful images with editability and tool-friendly outputs.

DALL·E 3 — Clean composition and rich detail for concepting and illustration.

DALL·E 2 — Lightweight text-to-image for quick drafts and style exploration.

Sora (legacy) — Earlier Sora generation for baseline motion tests and rapid previews.

Openai-whisper — High-accuracy multilingual speech recognition model for precise transcription with automatic language detection and punctuation.

Openai-whisper-turbo — Optimized Whisper variant delivering the same accuracy with significantly faster transcription speed for real-time and large-scale use.

openai/gpt-image-1-mini/text-to-image generates high-quality images directly from text prompts with GPT-5-level understanding and efficiency, ideal for creative and design tasks.

openai/gpt-image-1-mini/edit edits and refines images from natural-language instructions, preserving style and composition while applying precise changes.

openai/gpt-image-1-high-fidelity delivers ultra-detailed, photorealistic image generation powered by GPT-5, with superior texture, lighting, and realism for professional-grade creative and design work.

openai/gpt-image-1.5/text-to-image generates high-quality images from natural-language prompts with strong prompt understanding and cost-efficient performance, producing coherent composition and clean aesthetics for UI concepts, marketing visuals, prototyping, and fast creative iteration.

openai/gpt-image-2/edit performs high-fidelity image editing from natural-language instructions and reference images, preserving visual coherence, stylistic consistency, and fine-grained detail for marketing assets, design refinement, and fast creative iteration.

openai/gpt-image-2/text-to-image delivers high-quality text-to-image generation with strong prompt adherence, clean composition, and polished aesthetics for UI concepts, campaign assets, and rapid creative ideation.

Why OpenAI Models?

State-of-the-art quality — Physics-aware video, synchronized audio, and high-fidelity images with strong prompt adherence.

End-to-end workflow — Text-to-image, image-to-video, and text-to-video in one stack; smooth handoff between models.

Pro-grade control — Seeds, duration/aspect, camera language, and edit ops for consistent, repeatable results.

Wide style range — From photoreal and documentary to anime, illustration, and cinematic looks—without plastic over-sharpening.

OpenAI Models

Cutting-edge OpenAI models across text, image, and multimodal creation—curated in one place. These models sit at the front line of generative AI, combining strong reasoning, cinematic rendering, and reliable performance for real-world workflows.

All Models

openai/gpt-image-2/edit

openai/gpt-image-2/text-to-image

wavespeed-ai/openai-whisper-turbo

openai/gpt-image-1.5/text-to-image

openai/sora-2/image-to-video-pro

openai/sora-2/text-to-video

openai/sora-2/text-to-video-pro

wavespeed-ai/openai-whisper

openai/gpt-image-1

openai/gpt-image-1-mini/text-to-image

openai/gpt-image-1-mini/edit

openai/sora-2/image-to-video

openai/gpt-image-1.5/edit

wavespeed-ai/openai-whisper-with-video

openai/sora-2-pro/text-to-video

openai/sora-2-pro/image-to-video

openai/sora-2/characters

openai/gpt-image-1-high-fidelity

openai/gpt-image-1/text-to-image

OpenAI Models API: Pricing and Performance

Why Run OpenAI Models on WaveSpeedAI

Transparent Pricing

Optimized for Low Latency

99.9% Uptime

Frequently Asked Questions

Explore 1,000+ AI Models

Build with the API