
Brainstorm, generate, edit, and iterate faster across images and videos with WaveSpeedAI.

AI Video Converter converts videos between formats. Upload a video and specify the target format to get a converted result. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Video FPS Increaser doubles your video frame rate for smoother motion and better playback quality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

FlashVSR is a fast, high-quality video upscaler that boosts resolution and restores clarity for low-resolution or blurry footage. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

WaveSpeedAI Video Outpainter expands any video beyond its original boundaries while preserving motion, identity, and scene coherence. Perfect for aspect-ratio changes, reframing, adding safe margins, or generating new visual context without cropping or losing content.

Video Body Swap puts your face onto the body in a target video. Upload a face image and a body video to get a seamless swap. Ready-to-use REST inference API, no cold starts, affordable pricing.

WaveSpeed Video Background Remover removes video backgrounds or replaces them with a custom image. Upload or paste a link to your video, then provide a background image by URL or file. Clean matting, edge-aware blending, and natural compositing keep subjects realistic. Built for creator workflows and batch jobs. Ready-to-use REST inference API with fast response, no cold starts, and predictable pricing.

AI Kissing generates a romantic kissing video from one or two input images. Upload one image with two people, or two separate images to composite them together. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Virtual Outfit Try-On generates videos of a person wearing uploaded clothing. Upload a portrait and clothing images, add an optional prompt, and get a try-on video. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Twerk generates a fun twerking dance video from a single input image. Upload a photo and the model animates the person into an energetic twerking dance with upbeat hip-hop music. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Talking Photos brings your photos to life: upload a portrait and text, and watch the person speak. Supports durations of 5 to 15 seconds. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Parkour Video generates dynamic parkour action videos from a portrait image. Choose from 6 parkour styles or provide a reference video. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Video Ads generates product advertisement videos. Provide a photo of a person, a product name, and an optional product image or script, and the model creates a professional ad video. Ready-to-use REST inference API, no cold starts, affordable pricing.

AI Dog Selfie Video generates cute dog selfie videos with customizable breed, style, expression, action, and duration. Ready-to-use REST inference API, no cold starts, affordable pricing.

WaveSpeed Cinematic Video Generator creates Hollywood-quality videos from text prompts and optional reference images with native audio, director-level camera control, and real-world physics. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2.3 with LoRA support is a DiT-based audio-video foundation model designed to generate synchronized video and audio with custom styles, motion, or likeness training. It offers improved audio and visual quality and enhanced prompt adherence. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

SAM 3 Video RLE is a unified foundation model for prompt-based segmentation in video. Track and segment objects across frames using text, points, or boxes, returning RLE-encoded masks for efficient processing. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
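
To illustrate what an RLE-encoded mask is, here is a minimal sketch of run-length encoding for a binary mask. The row-major pixel order, the convention that the first run counts zeros, and the function names `rle_encode` and `rle_decode` are all assumptions made for illustration. The model's actual output layout may differ and should be checked against its documentation.

```python
from typing import List


def rle_encode(mask: List[List[int]]) -> List[int]:
    """Encode a binary mask as alternating run lengths, starting with a run of 0s."""
    # Assumption: pixels are flattened row by row (row-major order).
    flat = [pixel for row in mask for pixel in row]
    runs: List[int] = []
    current, count = 0, 0
    for pixel in flat:
        if pixel == current:
            count += 1
        else:
            runs.append(count)  # close the current run (may be 0 if the mask starts with 1)
            current, count = pixel, 1
    runs.append(count)
    return runs


def rle_decode(runs: List[int], height: int, width: int) -> List[List[int]]:
    """Rebuild the binary mask from alternating run lengths."""
    flat: List[int] = []
    value = 0
    for length in runs:
        flat.extend([value] * length)
        value = 1 - value  # runs alternate between 0 and 1
    return [flat[r * width:(r + 1) * width] for r in range(height)]


mask = [[0, 0, 1, 1],
        [1, 1, 0, 0]]
print(rle_encode(mask))  # [2, 4, 2]
```

Eight pixels collapse to three integers here; the saving grows with mask size, since segmentation masks are mostly long runs of identical values.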

LTX-2.3 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model, with improved audio and visual quality as well as enhanced prompt adherence. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Wan 2.1 i2v-480p turns images into unlimited 480p AI videos with the Wan 2.1 image-to-video model, perfect for fast content creation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

AI Ghibli Filter Video transforms a photo into a Studio Ghibli anime-style video with customizable duration. Ready-to-use REST inference API, no cold starts, affordable pricing.

RIFE Video Interpolation generates smooth intermediate frames between existing video frames for higher frame rates and smoother motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
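
As a rough illustration of what inserting intermediate frames means, the sketch below doubles a clip's frame count by placing a linearly blended frame between each pair of neighbours. This is a deliberately naive stand-in, not RIFE itself, which estimates motion with a learned model rather than averaging pixels. Frames are assumed to be flat lists of pixel intensities, and the names `blend` and `double_frame_rate` are hypothetical.

```python
from typing import List

Frame = List[float]  # assumption: one flat list of pixel intensities per frame


def blend(a: Frame, b: Frame, t: float = 0.5) -> Frame:
    """Linearly mix two frames; t=0.5 gives the midpoint frame."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def double_frame_rate(frames: List[Frame]) -> List[Frame]:
    """Insert one blended frame between every pair of consecutive frames."""
    if len(frames) < 2:
        return list(frames)  # nothing to interpolate between
    out: List[Frame] = []
    for prev, nxt in zip(frames, frames[1:]):
        out.append(prev)
        out.append(blend(prev, nxt))
    out.append(frames[-1])
    return out


clip = [[0.0, 10.0], [2.0, 20.0]]
print(double_frame_rate(clip))  # [[0.0, 10.0], [1.0, 15.0], [2.0, 20.0]]
```

Plain blending produces ghosting on fast motion, which is the problem motion-aware interpolators are designed to avoid.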

AI Vocal Remover separates vocals from the instrumental in any audio track. Upload an audio file and choose to extract the vocals or the instrumental. Ready-to-use REST inference API, no cold starts, affordable pricing.

Depth Anything Video estimates depth maps from video input with temporal consistency. Supports multiple model sizes and colormaps. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

daVinci MagiHuman Text-to-Video API is a 15B-parameter omni video generation model, billed as the new open-source leader on par with WAN 2.5. It generates high-quality AI videos from text prompts with optional audio input, and supports digital humans, talking heads, and flexible aspect ratios, durations, and resolutions. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

LTX-2 19B Video Upscaler converts low-resolution videos into crisp 4K footage with seamless motion dynamics and frame consistency. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

VACE Video Joiner seamlessly joins multiple video clips into one using AI-powered transition generation. Upload 2 to 4 videos and get a smoothly joined result. Ready-to-use REST inference API, no cold starts, affordable pricing.

LTX-2 19B ControlNet generates synchronized audio-video (up to 20s) from video input with pose, depth, or canny edge guidance. Supports audio preservation, generation, or removal for flexible video transformation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

SCAIL enables high-fidelity character animation using reference images. It handles large motion variations, stylized characters, and multi-character interactions without explicit per-frame structural guidance. Ready-to-use REST inference API, no cold starts, affordable pricing.

Wan2.1-DITTO is a unified video-to-video model for realistic style transfer and reenactment, replicating holistic movement and expressions across frames. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Wan2.2-Fun-Control uses Control Codes and multi-modal inputs to generate preset-controlled videos up to 120s at 720p. It is released under Apache 2.0, which permits commercial use. Ready-to-use REST inference API, no cold starts, affordable pricing.

Run any model in the Best Video Models collection through a single REST API. Pay per generation, with no subscription or minimum, and get state-of-the-art latency on infrastructure with 99.9% uptime.

Per-call pricing for every model in Best Video Models. The price is listed on each model's page, with no additional platform fees.

Most image models in Best Video Models run in under 2 seconds. Video and 3D models are several times faster than self-hosted alternatives.

Multi-region failover and automatic retries keep your production traffic online, even during a provider outage.
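
The retry behaviour described above runs on the platform side, and its implementation is not documented here. For readers who want the same pattern in their own client code, the following is a minimal sketch of retrying with exponential backoff. The helper name `with_retries` and its parameters are hypothetical, and `call` stands in for whatever request function your application uses.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without waiting. In production code it is usually better to catch only transient errors, such as timeouts, rather than every exception.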

Each model has its own per-call price listed on its page. We bill for each successful generation, with no subscription or minimum.

Image models in this collection typically finish in under 2 seconds. Video and 3D models depend on duration and resolution but are generally several times faster than self-hosted runs.

Yes. Every account receives $1 in free credits at sign-up, enough to try most models in Best Video Models without a credit card.

Standard accounts have generous concurrent-job limits. Enterprise plans offer custom RPM, higher concurrency, and dedicated capacity. Contact sales for details.

Browse our full catalog of cutting-edge AI models: image, video, 3D, audio, LLM, and more.
wavespeed.ai/models →

Integrate AI into your own apps. RESTful API with client libraries, no cold starts, pay-as-you-go pricing.
wavespeed.ai/docs →