
Brainstorm, generate, edit, and iterate faster across images and videos with WaveSpeedAI.

AI Video Converter converts videos between formats. Upload a video and specify the target format to get a converted result. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Video FPS Increaser doubles your video frame rate for smoother motion and better playback quality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

FlashVSR is a fast, high-quality video upscaler that boosts resolution and restores clarity for low-resolution or blurry footage. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

WaveSpeedAI Video Outpainter expands any video beyond its original boundaries while preserving motion, identity, and scene coherence. Perfect for aspect-ratio changes, reframing, adding safe margins, or generating new visual context without cropping or losing content.

Video Body Swap replaces the body in a target video with your face. Upload a face image and a body video to get a seamless swap. Ready-to-use REST inference API, no coldstarts, affordable pricing.

WaveSpeed Video Background Remover replaces or removes video backgrounds with a custom image. Upload or paste a link to your video, then provide a background image by URL or file—clean matting, edge-aware blending, and natural compositing keep subjects realistic. Built for creator workflows and batch jobs. Ready-to-use REST inference API with fast response, no cold starts, and predictable pricing.

AI Kissing generates a romantic kissing video from one or two input images. Upload one image with two people, or two separate images to composite them together. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Virtual Outfit Try-On generates videos of a person wearing uploaded clothing. Upload a portrait and clothing images, add an optional prompt, and get a try-on video. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Twerk generates a fun twerking dance video from a single input image. Upload a photo and the model animates the person into an energetic twerking dance with upbeat hip-hop music. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Talking Photos brings your photos to life — upload a portrait and text, and watch the person speak. Supports 5-15 seconds duration. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Parkour Video generates dynamic parkour action videos from a portrait image. Choose from 6 parkour styles or provide a reference video. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Video Ads generates product advertisement videos. Provide a person photo, product name, and optional product image or script, and AI creates a professional ad video. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Dog Selfie Video generates cute dog selfie videos with customizable breed, style, expression, action, and duration. Ready-to-use REST inference API, no coldstarts, affordable pricing.

WaveSpeed Cinematic Video Generator creates Hollywood-quality videos from text prompts and optional reference images with native audio, director-level camera control, and real-world physics. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2.3 with LoRA support is a DiT-based audio-video foundation model designed to generate synchronized video and audio with custom styles, motion, or likeness training. Improved audio and visual quality with enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

SAM 3 Video RLE is a unified foundation model for prompt-based segmentation in video. Track and segment objects across frames using text, points, or boxes, returning RLE encoded masks for efficient processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

LTX-2.3 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model, with improved audio and visual quality as well as enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Wan 2.1 i2v-480p turns images into unlimited 480p AI videos with the Wan 2.1 image-to-video model, perfect for fast content creation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

LTX-2.3 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model, with improved audio and visual quality as well as enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

AI Ghibli Filter Video transforms a photo into a Studio Ghibli anime style video with customizable duration. Ready-to-use REST inference API, no coldstarts, affordable pricing.

RIFE Video Interpolation generates smooth intermediate frames between existing video frames for higher frame rates and smoother motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

AI Vocal Remover separates vocals from instrumental in any audio track. Upload an audio file and choose to extract vocals or instrumental. Ready-to-use REST inference API, no coldstarts, affordable pricing.

Depth Anything Video estimates depth maps from video input with temporal consistency. Supports multiple model sizes and colormaps. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2.3 with LoRA support is a DiT-based audio-video foundation model designed to generate synchronized video and audio with custom styles, motion, or likeness training. Improved audio and visual quality with enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

daVinci MagiHuman Text-to-Video API — a 15B parameter omni video generation model, the new open-source king on par with WAN 2.5. Generates high-quality AI videos from text prompts with optional audio input. Supports digital humans, talking heads, flexible aspect ratios, durations, and resolutions. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

LTX-2 19B Video Upscaler converts low-resolution videos into crisp 4K footage with seamless motion dynamics and frame consistency. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

VACE Video Joiner seamlessly joins multiple video clips into one using AI-powered transition generation. Upload 2 to 4 videos and get a smoothly joined result. Ready-to-use REST inference API, no coldstarts, affordable pricing.

LTX-2 19B ControlNet generates synchronized audio-video (up to 20s) from video input with pose, depth, or canny edge guidance. Supports audio preservation, generation, or removal for flexible video transformation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

SCAIL enables high-fidelity character animation using reference images. It handles large motion variations, stylized characters, and multi-character interactions without explicit per-frame structural guidance. Ready-to-use REST inference API, no coldstarts, affordable pricing.

Wan2.1-DITTO is a unified video-to-video model for realistic style transfer and reenactment, replicating holistic movement and expressions across frames. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Wan2.2-Fun-Control uses Control Codes and multi-modal inputs to generate preset-controlled videos up to 120s at 720p; released under Apache 2.0 for commercial use. Ready-to-use REST API, no coldstarts, affordable.
通过单一 REST API 运行 Best Video Models 系列中的任意模型。按生成计费 — 无订阅、无最低消费 — 在 99.9% 可用性的基础设施上提供行业领先的延迟。
每个 Best Video Models 模型都有按调用计价。价格在每个模型的页面上列出 — 不收取额外的平台费。
大多数 Best Video Models 图像模型在 2 秒内完成。视频和 3D 模型比自托管方案快数倍。
多区域故障转移和自动重试可确保您的生产流量保持在线 — 即使在供应商故障期间。
每个模型在其模型页面上都列有自己的按调用价格。我们按每次成功生成计费,没有订阅费或最低消费。
本系列中的图像模型通常在 2 秒内完成。视频和 3D 模型取决于时长和分辨率,但通常比自托管运行快数倍。
可以 — 每个账户在注册时获得 $1 的免费额度,足以在不使用信用卡的情况下试用大多数 Best Video Models 模型。
标准账户有充足的并发任务限制。企业版计划提供自定义 RPM、更高并发和专用容量 — 详情请联系销售。