
Brainstorm, generate, edit, and iterate faster across images and videos with WaveSpeedAI.

AI Video Converter converts videos between formats. Upload a video and specify the target format to get a converted result. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Video FPS Increaser doubles your video frame rate for smoother motion and better playback quality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

FlashVSR is a fast, high-quality video upscaler that boosts resolution and restores clarity for low-resolution or blurry footage. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

WaveSpeedAI Video Outpainter expands any video beyond its original boundaries while preserving motion, identity, and scene coherence. Perfect for aspect-ratio changes, reframing, adding safe margins, or generating new visual context without cropping or losing content.

Video Body Swap replaces the body in a target video with your face. Upload a face image and a body video to get a seamless swap. Ready-to-use REST inference API, no coldstarts, affordable pricing.

WaveSpeed Video Background Remover replaces or removes video backgrounds with a custom image. Upload or paste a link to your video, then provide a background image by URL or file—clean matting, edge-aware blending, and natural compositing keep subjects realistic. Built for creator workflows and batch jobs. Ready-to-use REST inference API with fast response, no cold starts, and predictable pricing.

AI Kissing generates a romantic kissing video from one or two input images. Upload one image with two people, or two separate images to composite them together. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Virtual Outfit Try-On generates videos of a person wearing uploaded clothing. Upload a portrait and clothing images, add an optional prompt, and get a try-on video. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Twerk generates a fun twerking dance video from a single input image. Upload a photo and the model animates the person into an energetic twerking dance with upbeat hip-hop music. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Talking Photos brings your photos to life — upload a portrait and text, and watch the person speak. Supports 5-15 seconds duration. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Parkour Video generates dynamic parkour action videos from a portrait image. Choose from 6 parkour styles or provide a reference video. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Video Ads generates product advertisement videos. Provide a person photo, product name, and optional product image or script, and AI creates a professional ad video. Ready-to-use REST inference API, no coldstarts, affordable pricing.

AI Dog Selfie Video generates cute dog selfie videos with customizable breed, style, expression, action, and duration. Ready-to-use REST inference API, no coldstarts, affordable pricing.

WaveSpeed Cinematic Video Generator creates Hollywood-quality videos from text prompts and optional reference images with native audio, director-level camera control, and real-world physics. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2.3 with LoRA support is a DiT-based audio-video foundation model designed to generate synchronized video and audio with custom styles, motion, or likeness training. Improved audio and visual quality with enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

SAM 3 Video RLE is a unified foundation model for prompt-based segmentation in video. Track and segment objects across frames using text, points, or boxes, returning RLE encoded masks for efficient processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

LTX-2.3 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model, with improved audio and visual quality as well as enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Wan 2.1 i2v-480p turns images into unlimited 480p AI videos with the Wan 2.1 image-to-video model, perfect for fast content creation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

LTX-2.3 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model, with improved audio and visual quality as well as enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

AI Ghibli Filter Video transforms a photo into a Studio Ghibli anime style video with customizable duration. Ready-to-use REST inference API, no coldstarts, affordable pricing.

RIFE Video Interpolation generates smooth intermediate frames between existing video frames for higher frame rates and smoother motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

AI Vocal Remover separates vocals from instrumental in any audio track. Upload an audio file and choose to extract vocals or instrumental. Ready-to-use REST inference API, no coldstarts, affordable pricing.

Depth Anything Video estimates depth maps from video input with temporal consistency. Supports multiple model sizes and colormaps. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2.3 with LoRA support is a DiT-based audio-video foundation model designed to generate synchronized video and audio with custom styles, motion, or likeness training. Improved audio and visual quality with enhanced prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

daVinci MagiHuman Text-to-Video API — a 15B parameter omni video generation model, the new open-source king on par with WAN 2.5. Generates high-quality AI videos from text prompts with optional audio input. Supports digital humans, talking heads, flexible aspect ratios, durations, and resolutions. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

LTX-2 19B Video Upscaler converts low-resolution videos into crisp 4K footage with seamless motion dynamics and frame consistency. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

VACE Video Joiner seamlessly joins multiple video clips into one using AI-powered transition generation. Upload 2 to 4 videos and get a smoothly joined result. Ready-to-use REST inference API, no coldstarts, affordable pricing.

LTX-2 19B ControlNet generates synchronized audio-video (up to 20s) from video input with pose, depth, or canny edge guidance. Supports audio preservation, generation, or removal for flexible video transformation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

SCAIL enables high-fidelity character animation using reference images. It handles large motion variations, stylized characters, and multi-character interactions without explicit per-frame structural guidance. Ready-to-use REST inference API, no coldstarts, affordable pricing.

Wan2.1-DITTO is a unified video-to-video model for realistic style transfer and reenactment, replicating holistic movement and expressions across frames. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Wan2.2-Fun-Control uses Control Codes and multi-modal inputs to generate preset-controlled videos up to 120s at 720p; released under Apache 2.0 for commercial use. Ready-to-use REST API, no coldstarts, affordable.
Best Video Models 컬렉션의 모든 모델을 단일 REST API로 실행하세요. 생성당 과금 — 구독 없음, 최소 요금 없음 — 99.9% 가동률 인프라에서 업계 최고의 지연 시간을 제공합니다.
모든 Best Video Models 모델에 대한 호출당 가격. 가격은 각 모델 페이지에 표시되며 플랫폼 수수료는 추가되지 않습니다.
대부분의 Best Video Models 이미지 모델은 2초 이내에 완료됩니다. 비디오 및 3D 모델은 셀프 호스팅 대안보다 몇 배 더 빠릅니다.
다중 리전 페일오버와 자동 재시도로 프로바이더 장애 중에도 운영 트래픽을 온라인 상태로 유지합니다.
각 모델에는 모델 페이지에 호출당 자체 가격이 표시되어 있습니다. 성공한 생성 단위로 청구되며 구독 요금이나 최소 요금은 없습니다.
이 컬렉션의 이미지 모델은 일반적으로 2초 이내에 완료됩니다. 비디오 및 3D 모델은 길이와 해상도에 따라 다르지만 보통 셀프 호스팅 실행보다 몇 배 더 빠릅니다.
예 — 가입 시 모든 계정에 $1의 무료 크레딧이 제공되며, 신용카드 없이 대부분의 Best Video Models 모델을 시도하기에 충분합니다.
표준 계정에는 넉넉한 동시 작업 제한이 있습니다. Enterprise 플랜은 맞춤형 RPM, 더 높은 동시성, 전용 용량을 제공합니다 — 자세한 내용은 영업팀에 문의하세요.