
Brainstorm, generate, edit, and iterate faster across images and videos with WaveSpeedAI.

AI Video Converter converts videos between formats. Upload a video and specify the target format to get a converted result. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Video FPS Increaser doubles your video frame rate for smoother motion and better playback quality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

FlashVSR is a fast, high-quality video upscaler that boosts resolution and restores clarity for low-resolution or blurry footage. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

WaveSpeedAI Video Outpainter expands any video beyond its original boundaries while preserving motion, identity, and scene coherence. Perfect for aspect-ratio changes, reframing, adding safe margins, or generating new visual context without cropping or losing content.

Video Body Swap puts your face onto the body in a target video. Upload a face image and a body video to get a seamless swap. Ready-to-use REST inference API, no cold starts, affordable pricing.

WaveSpeed Video Background Remover removes video backgrounds or replaces them with a custom image. Upload or paste a link to your video, then provide a background image by URL or file. Clean matting, edge-aware blending, and natural compositing keep subjects realistic. Built for creator workflows and batch jobs. Ready-to-use REST inference API with fast response, no cold starts, and predictable pricing.

AI Kissing generates a romantic kissing video from one or two input images. Upload one image with two people, or two separate images to composite them together. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Virtual Outfit Try-On generates videos of a person wearing uploaded clothing. Upload a portrait and clothing images, add an optional prompt, and get a try-on video. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Twerk generates a fun twerking dance video from a single input image. Upload a photo and the model animates the person into an energetic twerking dance with upbeat hip-hop music. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Talking Photos brings your photos to life: upload a portrait and text, and watch the person speak. Supports durations of 5 to 15 seconds. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Parkour Video generates dynamic parkour action videos from a portrait image. Choose from 6 parkour styles or provide a reference video. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Video Ads generates product advertisement videos. Provide a photo of a person, a product name, and an optional product image or script, and the AI creates a professional ad video. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Dog Selfie Video generates cute dog selfie videos with customizable breed, style, expression, action, and duration. Ready-to-use REST inference API, no cold starts, affordable pricing.

WaveSpeed Cinematic Video Generator creates Hollywood-quality videos from text prompts and optional reference images with native audio, director-level camera control, and real-world physics. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2.3 with LoRA support is a DiT-based audio-video foundation model designed to generate synchronized video and audio with custom styles, motion, or likeness training. Improved audio and visual quality with enhanced prompt adherence. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

SAM 3 Video RLE is a unified foundation model for prompt-based segmentation in video. Track and segment objects across frames using text, points, or boxes, returning RLE-encoded masks for efficient processing. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
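
For readers unfamiliar with run-length encoding, the sketch below shows the general idea behind an RLE mask: a binary mask is stored as alternating run lengths instead of one value per pixel. This is only an illustration. The exact RLE layout the API returns (pixel order, count format, compression) is not stated here, so the function names and the "start with the run of zeros" convention are assumptions.

```python
from typing import List


def rle_encode(mask: List[int]) -> List[int]:
    """Encode a flat 0/1 mask as alternating run lengths.

    Assumed convention: the first count is always the run of 0s,
    so a mask that starts with 1 begins with a count of 0.
    """
    counts: List[int] = []
    current = 0  # value of the run being counted
    run = 0      # length of the current run
    for pixel in mask:
        if pixel == current:
            run += 1
        else:
            counts.append(run)
            current = pixel
            run = 1
    counts.append(run)
    return counts


def rle_decode(counts: List[int]) -> List[int]:
    """Rebuild the flat 0/1 mask from alternating run lengths."""
    mask: List[int] = []
    value = 0
    for run in counts:
        mask.extend([value] * run)
        value = 1 - value  # runs alternate between 0 and 1
    return mask


flat_mask = [0, 0, 1, 1, 1, 0, 1]
encoded = rle_encode(flat_mask)
assert rle_decode(encoded) == flat_mask
```

Long uniform regions collapse into a single count, which is why RLE masks are much smaller than per-pixel masks for typical segmentation output.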

LTX-2.3 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model, with improved audio and visual quality as well as enhanced prompt adherence. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Wan 2.1 i2v-480p turns images into unlimited 480p AI videos with the Wan 2.1 image-to-video model, perfect for fast content creation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

AI Ghibli Filter Video transforms a photo into a Studio Ghibli anime-style video with customizable duration. Ready-to-use REST inference API, no cold starts, affordable pricing.

RIFE Video Interpolation generates smooth intermediate frames between existing video frames for higher frame rates and smoother motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
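
To make "intermediate frames" concrete, here is a minimal sketch that inserts one blended frame between each pair of neighbors. It uses naive linear blending of pixel values, which is not RIFE's learned, flow-based method; it only shows how interpolation raises the frame count. Frames are modeled as flat lists of floats, and all names are hypothetical.

```python
from typing import List

Frame = List[float]  # a frame as a flat list of pixel intensities


def blend(a: Frame, b: Frame, t: float = 0.5) -> Frame:
    """Linearly mix two frames; t=0.5 gives the midpoint frame."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def double_frame_rate(frames: List[Frame]) -> List[Frame]:
    """Insert one blended frame between each pair of consecutive frames."""
    if not frames:
        return []
    out: List[Frame] = []
    for prev, nxt in zip(frames, frames[1:]):
        out.append(prev)
        out.append(blend(prev, nxt))
    out.append(frames[-1])
    return out


clip = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]
smooth = double_frame_rate(clip)
```

With N input frames the result has 2N - 1 frames, so playing it back over the same duration roughly doubles the frame rate. Plain blending produces ghosting on fast motion, which is the problem motion-aware interpolators are built to avoid.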

AI Vocal Remover separates vocals from the instrumental in any audio track. Upload an audio file and choose to extract the vocals or the instrumental. Ready-to-use REST inference API, no cold starts, affordable pricing.

Depth Anything Video estimates depth maps from video input with temporal consistency. Supports multiple model sizes and colormaps. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

daVinci MagiHuman Text-to-Video API: a 15B-parameter omni video generation model, the new open-source king, on par with WAN 2.5. Generates high-quality AI videos from text prompts with optional audio input. Supports digital humans, talking heads, flexible aspect ratios, durations, and resolutions. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2 19B Video Upscaler converts low-resolution videos into crisp 4K footage with seamless motion dynamics and frame consistency. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

VACE Video Joiner seamlessly joins multiple video clips into one using AI-powered transition generation. Upload 2 to 4 videos and get a smoothly joined result. Ready-to-use REST inference API, no cold starts, affordable pricing.

LTX-2 19B ControlNet generates synchronized audio-video (up to 20s) from video input with pose, depth, or canny edge guidance. Supports audio preservation, generation, or removal for flexible video transformation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

SCAIL enables high-fidelity character animation using reference images. It handles large motion variations, stylized characters, and multi-character interactions without explicit per-frame structural guidance. Ready-to-use REST inference API, no cold starts, affordable pricing.

Wan2.1-DITTO is a unified video-to-video model for realistic style transfer and reenactment, replicating holistic movement and expressions across frames. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Wan2.2-Fun-Control uses Control Codes and multi-modal inputs to generate preset-controlled videos up to 120s at 720p. It is released under Apache 2.0 for commercial use. Ready-to-use REST inference API, no cold starts, affordable pricing.

Run any model in the Best Video Models collection through a single REST API. Pay per generation, with no subscriptions and no minimums, and get industry-leading latency on infrastructure with 99.9% uptime.

Per-call pricing for every model in Best Video Models. The price is shown on each model's page, with no additional platform fees.

Most image models in Best Video Models finish in under 2 seconds. Video and 3D models are several times faster than self-hosted alternatives.

Multi-region failover and automatic retries keep your production traffic online, even during provider outages.
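
As a rough illustration of calling a model through one REST endpoint, the sketch below builds (but does not send) a JSON POST request with Python's standard library. The base URL, path layout, model ID, input field, and bearer-token scheme are all placeholders assumed for the example; check wavespeed.ai/docs for the real endpoint and parameters.

```python
import json
import urllib.request

# Hypothetical placeholder: the real base URL and path scheme are not given here.
BASE_URL = "https://api.example.com/v1"


def build_request(model_id: str, inputs: dict, api_key: str) -> urllib.request.Request:
    """Build a JSON POST request for one model call (assumed request shape)."""
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{model_id}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# Hypothetical model ID and input field, for illustration only.
req = build_request(
    "example/video-upscaler",
    {"video": "https://example.com/clip.mp4"},
    "YOUR_API_KEY",
)
# Sending it would be: urllib.request.urlopen(req, timeout=60)
```

Keeping the model ID as a path parameter means switching models only changes one string and the input payload, which matches the "single API" framing above.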

Each model has its own per-call price listed on the model page. We charge per successful generation, with no subscription fees or minimums.

Image models in this collection typically finish in under 2 seconds. Video and 3D models depend on duration and resolution, but are usually several times faster than self-hosted runs.

Yes. Every account receives US$1 in free credits at sign-up, enough to try most models in Best Video Models without a credit card.

Standard accounts have generous concurrent-task limits. Enterprise plans offer custom RPM, higher concurrency, and dedicated capacity; contact sales for details.

Browse our full catalog of state-of-the-art AI models: image, video, 3D, audio, LLM, and more.
wavespeed.ai/models →

Integrate AI into your own applications. RESTful API with client libraries, no cold starts, pay per use.
wavespeed.ai/docs →