
First & Last Frame Video — Controlled AI Transitions API
Define the beginning and the end — let AI create the journey. Upload start and end images and let AI generate smooth, controlled transitions using Kling and Luma models.
Controlled AI Video Transitions
First and Last Frame video generation gives you goal-oriented control — define where the video starts and ends, and AI fills the gap with natural motion.
Dual-Image Conditioning
Upload both a start frame and an end frame. The AI analyzes both inputs and generates intermediate frames that create a seamless, physically plausible transition between the two keyframes.

Goal-Oriented Video Control
Unlike standard image-to-video, which controls only the starting frame, First and Last Frame ensures the video ends exactly where you want. This is critical for storytelling, editing, and precise visual narratives.

Multi-Model Keyframe Support
WaveSpeed aggregates the best models that offer precise start and end frame conditioning — including Kling and Luma. Choose the right model for your specific transition needs.

First & Last Frame on WaveSpeed vs. Standard Image-to-Video
See why creators choose keyframe-controlled video on WaveSpeed over standard methods.
Performance at a Glance
First and Last Frame video on WaveSpeed delivers controlled, reliable transitions at scale.
Examples

Young woman turning to smile at camera, breeze catching her scarf, soft bokeh background.

Dancer performing a graceful pirouette, flowing dress creating motion trails, spotlight.

Butterfly emerging from chrysalis in close-up, wings slowly unfurling, soft natural light.

Detective walking through foggy city streets, trench coat collar up, film noir atmosphere.
Integrate in Minutes
Production-ready SDKs for Python and JavaScript. REST API with full OpenAPI spec. Webhook support for async jobs.
- Dual-image input for start and end frame
- Multiple keyframe-capable models available
- Python & JavaScript SDKs + REST API
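As a minimal sketch of what a REST submission could look like in Python: the endpoint URL, model path, and field names (`start_image`, `end_image`, `prompt`, `duration`) are illustrative assumptions, not the documented schema, so check them against the official WaveSpeed API reference before use.

```python
import json
import os
import urllib.request

# NOTE: the endpoint and the field names below are illustrative
# assumptions for this sketch, not the documented WaveSpeed schema.
API_URL = "https://api.wavespeed.ai/api/v3/kling/first-last-frame"  # hypothetical

def build_payload(start_image_url: str, end_image_url: str,
                  prompt: str, duration_s: int = 5) -> dict:
    """Assemble the dual-image request body: one start frame, one
    end frame, plus a text prompt describing the transition."""
    return {
        "start_image": start_image_url,
        "end_image": end_image_url,
        "prompt": prompt,
        "duration": duration_s,
    }

def submit_job(payload: dict, api_key: str) -> dict:
    """POST the job and return the parsed JSON response. Generation is
    async, so poll the returned job id or register a webhook."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    payload = build_payload(
        "https://example.com/standing.png",
        "https://example.com/sitting.png",
        "The man sits down slowly",
    )
    print(submit_job(payload, os.environ["WAVESPEED_API_KEY"]))
```

The same two-image payload shape applies whichever keyframe-capable model you select; only the model path changes.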
Get Any Tool You Want
1000+ models across image, video, audio, and 3D — all through one API.
FAQ
What is First and Last Frame video generation?
It is a video generation technique where you provide both the starting image and the ending image. The AI's job is to generate the intermediate frames (interpolation) to create a video that starts exactly at image A and ends exactly at image B.

How is it different from standard image-to-video?
Standard image-to-video only lets you control the start; the ending is unpredictable. First and Last Frame gives you goal-oriented control, ensuring the video ends exactly where you want it to, which is crucial for storytelling and editing.

Can I use two completely unrelated images?
Yes, but the result will be a "morph" or a surreal transition. For realistic video, the two images should be logically connected (e.g., the same character in different poses, or the same room with different lighting).
How long can a generated clip be?
Most models support 5 to 10 seconds of generation between frames. For longer sequences, generate multiple "bridges" (A to B, then B to C) and stitch them together.
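The bridging approach can be sketched as a small planning helper that turns an ordered list of keyframes into one generation job per consecutive pair; the function and file names here are illustrative, not part of any SDK.

```python
def plan_bridges(keyframes: list[str]) -> list[tuple[str, str]]:
    """Turn an ordered keyframe list [A, B, C, ...] into consecutive
    (start, end) pairs -- one first/last-frame job per pair."""
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes to bridge")
    return list(zip(keyframes, keyframes[1:]))

# Three keyframes -> two clips: A->B and B->C.
bridges = plan_bridges(["scene_a.png", "scene_b.png", "scene_c.png"])
# -> [("scene_a.png", "scene_b.png"), ("scene_b.png", "scene_c.png")]
```

Because each clip ends on the exact frame the next one starts from, the generated clips can be concatenated cleanly afterwards, for example with ffmpeg's concat demuxer.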
Does the text prompt still matter?
Yes. The text prompt tells the AI the context of the change. If your start frame shows a man standing and your end frame shows him sitting, the prompt "The man sits down slowly" helps the AI generate the correct motion.

