Introducing VOID Video Inpainting on WaveSpeedAI

VOID Video Inpainting — remove objects from video with mask-guided AI inpainting. Quad-mask or auto SAM-3 masks, optional Pass 2 refinement for temporal consistency. Now live on WaveSpeedAI.

Clean Object Removal in Video That Finally Works

Removing an unwanted object from a video has long been one of the hard problems in post-production: frame-by-frame rotoscoping, clone-stamp patching, and temporal flicker. Most AI attempts have produced shimmering, unstable results that only look good in single frames. VOID Video Inpainting takes a different approach: mask-guided inpainting with an optional Pass 2 refinement that tightens temporal consistency. We’re excited to announce that VOID Video Inpainting is now live on WaveSpeedAI.

What Is VOID Video Inpainting?

VOID Video Inpainting is a mask-guided video object-removal model. You provide:

  • An input video.
  • A mask — either supplied manually (a quad mask) or auto-generated via SAM-3.

VOID produces a clean version of the video with the masked region inpainted, filled by context-aware content that matches the surrounding scene across time, not just per frame.

Key Features

Quad-Mask or Auto-SAM-3 Masks

Supply your own mask, or let the built-in SAM-3 integration generate one automatically from a bounding region or click prompt.

Optional Pass 2 Refinement

Enable enable_pass2_refinement to run a second refinement pass that sharpens temporal consistency, dramatically reducing flicker on difficult shots.

Adjustable Denoising and Guidance

Tune denoising_steps, guidance_scale, and temporal_window_size for the quality/cost tradeoff your shot needs.

Temporal Window Control

Set the number of frames the model reasons over at once. Larger windows preserve motion coherence on fast-moving content.

Production REST API

Not a research demo: a hardened endpoint ready to drop into post-production and editing pipelines.
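To make the tunables above concrete, here is a minimal Python sketch of how a pipeline might group them per shot. The four field names come from this post; the numeric values are placeholders rather than documented defaults, and the `VoidSettings` class and `settings_for` helper are hypothetical names used only for illustration. Check the model page for the real ranges and defaults.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VoidSettings:
    """Tunables named in the post. Default values are placeholders (assumptions)."""
    enable_pass2_refinement: bool = False
    denoising_steps: int = 30        # placeholder, not a documented default
    guidance_scale: float = 5.0      # placeholder, not a documented default
    temporal_window_size: int = 16   # placeholder, not a documented default

    def __post_init__(self) -> None:
        # Basic sanity checks; the real valid ranges are on the model page.
        if self.denoising_steps < 1:
            raise ValueError("denoising_steps must be at least 1")
        if self.guidance_scale < 0:
            raise ValueError("guidance_scale must be non-negative")
        if self.temporal_window_size < 1:
            raise ValueError("temporal_window_size must be at least 1")


def settings_for(fast_motion: bool, hero_shot: bool) -> VoidSettings:
    """Use a larger temporal window for fast motion and Pass 2 for hero shots."""
    return VoidSettings(
        enable_pass2_refinement=hero_shot,
        temporal_window_size=32 if fast_motion else 16,  # illustrative sizes
    )


# Plain dict that could be merged into a request body.
payload_fragment = asdict(settings_for(fast_motion=True, hero_shot=True))
```

Keeping the settings in one small, validated object makes it easier to reuse the same choices across every plate in a sequence.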

Real-World Use Cases

On-Set Mistakes — Unwanted Gear, Crew, Signage

Remove that boom mic, that reflection of the director, that passing truck — without a rotoscope artist.

Social / UGC Cleanup

Creators can wipe unwanted people, logos, or background clutter from phone-shot video.

E-Commerce Video Cleanup

Remove tags, fingers, or reflections from product videos on a per-SKU basis.

Archival and Documentary Restoration

Clean up archival footage — wires, trash, damage — while preserving the original aesthetic.

VFX Plate Preparation

Pre-clean plates before CG element insertion. The difference between a clean plate and a messy one is the difference between a 2-hour comp and a 2-day one.

Privacy and Compliance

Remove identifying marks, faces, or license plates from training data, documentation video, or publication material.

Getting Started on WaveSpeedAI

  1. Upload your source video.
  2. Provide a mask: a quad mask, a pre-made mask video, or one generated by SAM-3.
  3. Tune refinement settings — Pass 2 for best quality, skip for speed.
  4. Submit — production REST API, no cold starts.
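The steps above could be assembled into a request body along these lines. This is a sketch under stated assumptions, not the actual schema: `enable_pass2_refinement` is the only key taken from this post, while `video`, `mask_video`, and `auto_mask` are hypothetical key names. The quad-mask option is left out because its encoding is not described here, and the submit call is omitted because the endpoint and authentication details live on the model page.

```python
import json
from typing import Optional


def build_request(video_url: str,
                  mask_video_url: Optional[str] = None,
                  enable_pass2_refinement: bool = True) -> dict:
    """Assemble a request body for one inpainting job.

    Every key except enable_pass2_refinement is a hypothetical name.
    """
    if not video_url:
        raise ValueError("a source video is required")

    body = {
        "video": video_url,                                  # hypothetical key
        "enable_pass2_refinement": enable_pass2_refinement,  # named in the post
    }
    if mask_video_url is not None:
        body["mask_video"] = mask_video_url  # hypothetical key: supplied mask video
    else:
        body["auto_mask"] = True             # hypothetical key: let SAM-3 generate the mask
    return body


# Example: auto-generated mask, Pass 2 on for best quality.
request_json = json.dumps(build_request("https://example.com/shot_010.mp4"))
```

Turning Pass 2 off (`enable_pass2_refinement=False`) corresponds to the "skip for speed" option in step 3.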

The full request schema is on the model page.

Pricing

  • Base rate: $0.05 per second of source video.
  • Pass 2 refinement: 2× base (highly recommended for publishable work).
  • Supplied mask video: +$0.05 per second.

A 10-second clip with Pass 2 enabled and an auto-generated mask runs 10 × $0.05 × 2 = $1.00. A supplied mask video adds 10 × $0.05 = $0.50, bringing the total to $1.50.
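For budgeting a batch of shots, the rates above can be wrapped in a small estimator. This is a sketch, not a billing reference: it assumes the Pass 2 multiplier applies only to the base rate and not to the mask-video add-on (which matches the $1.50 example above), and it rounds to whole cents, whereas actual billing granularity may differ. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

BASE_RATE = Decimal("0.05")        # USD per second of source video
PASS2_MULTIPLIER = Decimal("2")    # Pass 2 refinement: 2x base
MASK_VIDEO_RATE = Decimal("0.05")  # USD per second add-on for a supplied mask video


def estimate_cost(seconds, pass2: bool = False,
                  supplied_mask_video: bool = False) -> Decimal:
    """Estimate the cost of one job in USD, rounded to cents."""
    secs = Decimal(str(seconds))
    if secs < 0:
        raise ValueError("seconds must be non-negative")

    cost = secs * BASE_RATE
    if pass2:
        cost *= PASS2_MULTIPLIER        # assumption: multiplier covers the base only
    if supplied_mask_video:
        cost += secs * MASK_VIDEO_RATE  # assumption: add-on is not doubled by Pass 2
    return cost.quantize(Decimal("0.01"))


print(estimate_cost(10, pass2=True))                            # 1.00
print(estimate_cost(10, pass2=True, supplied_mask_video=True))  # 1.50
```

`Decimal` is used instead of floats so that per-second rates sum to exact cent values across long batches.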

Why Run VOID Video Inpainting on WaveSpeedAI

  • One API across the video stack. Chain VOID inpainting with generation, upscaling, and editing models through one endpoint.
  • No cold starts. Critical for interactive post-production tools.
  • Transparent per-second + add-on pricing. Predictable billing for studios.
  • Production-scale throughput. Fan out batch jobs across a full shoot’s worth of plates.

Pro Tips

  • Always try Pass 2 on hero shots. The temporal-consistency win is worth 2× the cost.
  • SAM-3 masks are great starting points. For tricky edges, review and manually refine before submitting.
  • Tighter masks inpaint more cleanly. Oversized masks leave the model less surrounding context to draw on.
  • Use a larger temporal window for fast motion. Action shots benefit from the extended horizon; static shots don’t need it.
  • Run low-res proofs first. Dial in the mask and settings on a cheap pass, then kick off the final at full resolution with Pass 2.

Start Creating Today

VOID Video Inpainting is the cleanest object-removal pipeline we’ve seen packaged as a single API call — with real temporal consistency, not just per-frame magic.

Try VOID Video Inpainting now on WaveSpeedAI and remove unwanted content from video in one call.