Google Veo 3.1 Fast Reference to Video is a fast AI reference-to-video generation model that creates 8-second videos from up to three reference images using the official Veo predictLongRunning endpoint with referenceImages assets. It is available as a ready-to-use REST inference API for product videos, character consistency, branded visual storytelling, social media clips, advertising creatives, and professional reference-based video generation workflows, with simple integration, no cold starts, and affordable pricing.
Google Veo 3.1 Fast Reference-to-Video generates an 8-second video guided by up to three reference images and a text prompt. It is designed for subject, object, and product consistency, making it useful for character-led shots, product motion, style-guided generation, and other reference-driven video workflows.
Reference-guided generation
Use up to three reference images to preserve subject, object, or product identity in the generated video.
Fast Veo workflow
Built on Google Veo 3.1 Fast for quicker turnaround and efficient iteration.
Consistent 8-second output
Generates a fixed-length 8s MP4, making duration predictable for planning and pricing.
Flexible aspect ratio
Supports both 16:9 and 9:16 for landscape and vertical video use cases.
Optional audio generation
Enable generate_audio when you want the output to include generated sound.
Simple pricing
Pricing depends only on resolution and whether audio generation is enabled.
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Motion, scene, and camera instructions. |
| images | Yes | 1–3 reference images. These are sent as asset reference images. |
| aspect_ratio | No | 16:9 or 9:16. Default: 16:9. |
| resolution | No | 720p or 1080p. Default: 720p. |
| generate_audio | No | Whether to generate audio. Default: false. |
| negative_prompt | No | Things to avoid in the video. |
| seed | No | Random seed for reproducibility. |
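The constraints in the table above can be checked before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_payload` is a hypothetical helper, and it assumes that `images` is passed as a list of image URLs, which this page does not state explicitly.

```python
from typing import List, Optional

# Allowed values copied from the parameter table.
ASPECT_RATIOS = {"16:9", "9:16"}
RESOLUTIONS = {"720p", "1080p"}


def build_payload(
    prompt: str,
    images: List[str],  # assumption: each entry is an image URL
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
    generate_audio: bool = False,
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
) -> dict:
    """Validate inputs against the documented limits and return a JSON-ready dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if not 1 <= len(images) <= 3:
        raise ValueError("images must contain 1 to 3 reference images")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")

    payload = {
        "prompt": prompt,
        "images": list(images),
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "generate_audio": generate_audio,
    }
    # Optional fields are only included when the caller sets them.
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = seed
    return payload


payload = build_payload(
    "The same watch rotating slowly on a reflective black surface",
    ["https://example.com/watch-front.jpg"],  # placeholder URL
    aspect_ratio="9:16",
)
print(payload["aspect_ratio"], payload["resolution"])
```

Validating locally catches an out-of-range image count or an unsupported resolution before a request is submitted.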
- Provide 1–3 images for subject, style, or product guidance.
- Choose 16:9 for landscape or 9:16 for vertical output.
- Choose 720p for lower cost or 1080p for higher quality.
- Enable generate_audio if you want generated sound in the result.

Example prompt: A cinematic product reveal of the same luxury watch from the reference images, rotating slowly on a reflective black surface, dramatic studio lighting, soft camera push-in, premium commercial style
This model generates a fixed 8-second video.
| Mode | Cost |
|---|---|
| 720p without audio | $0.64 |
| 720p with audio | $0.80 |
| 1080p without audio | $0.80 |
| 1080p with audio | $0.96 |
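As a worked example of the pricing table above, the lookup below returns the per-run price and the number of runs a budget covers. It is a small sketch: the function names are illustrative, and prices are held in US cents to avoid floating-point rounding.

```python
# Per-run prices in US cents, copied from the pricing table.
PRICE_CENTS = {
    ("720p", False): 64,
    ("720p", True): 80,
    ("1080p", False): 80,
    ("1080p", True): 96,
}


def cost_cents(resolution: str = "720p", generate_audio: bool = False) -> int:
    """Return the price of one run in cents for the given settings."""
    try:
        return PRICE_CENTS[(resolution, bool(generate_audio))]
    except KeyError:
        raise ValueError(f"unsupported resolution: {resolution!r}") from None


def runs_within_budget(
    budget_cents: int, resolution: str = "720p", generate_audio: bool = False
) -> int:
    """Return how many whole runs fit in the budget."""
    return budget_cents // cost_cents(resolution, generate_audio)


print(cost_cents("1080p", True))   # 96
print(runs_within_budget(1000))    # 15 runs of 720p without audio for $10
```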
Pricing depends only on resolution and generate_audio. aspect_ratio, negative_prompt, seed, and the number of reference images do not affect pricing.

- Use 9:16 outputs for short-form mobile platforms.
- Use negative_prompt to reduce unwanted style drift or artifacts.
- Set seed when you want more reproducible generations.

- prompt and images are required.
- Reference images are sent as referenceImages added to the request payload.
- generate_audio defaults to false.

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/google/veo3.1-fast/reference-to-video with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Veo3.1 Fast Reference To Video follow.
cURL:

    # Submit the prediction.
    # "images" is required (1–3 reference images). The URL below is a placeholder,
    # and passing images as an array of URLs is an assumption.
    curl -X POST "https://api.wavespeed.ai/api/v3/google/veo3.1-fast/reference-to-video" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $WAVESPEED_API_KEY" \
      -d '{
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "images": ["https://example.com/reference-1.jpg"],
        "aspect_ratio": "16:9",
        "resolution": "720p",
        "generate_audio": false,
        "negative_prompt": "blurry, low quality, distorted",
        "seed": -1
      }'

    # Response includes a prediction id. Poll for the result,
    # replacing {request_id} with that id:
    curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
      -H "Authorization: Bearer $WAVESPEED_API_KEY"

    # When status is "completed", read the output from data.outputs[0].

JavaScript:

    // npm install wavespeed
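    const WaveSpeed = require('wavespeed');

    const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

    // `await` is not valid at the top level of a CommonJS module, so wrap the call.
    async function main() {
      const result = await client.run("google/veo3.1-fast/reference-to-video", {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        // Required: 1–3 reference images (placeholder URL, array-of-URLs format assumed).
        "images": ["https://example.com/reference-1.jpg"],
        "aspect_ratio": "16:9",
        "resolution": "720p",
        "generate_audio": false,
        "negative_prompt": "blurry, low quality, distorted",
        "seed": -1
      });
      console.log(result.outputs[0]); // → URL of the generated output
    }

    main().catch(console.error);

Python:

    # pip install wavespeed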
    import wavespeed

    output = wavespeed.run(
        "google/veo3.1-fast/reference-to-video",
        {
            "prompt": "A cinematic shot of a city at sunset, soft golden light",
            # Required: 1–3 reference images (placeholder URL, list-of-URLs format assumed).
            "images": ["https://example.com/reference-1.jpg"],
            "aspect_ratio": "16:9",
            "resolution": "720p",
            "generate_audio": False,  # Python boolean, not JSON `false`
            "negative_prompt": "blurry, low quality, distorted",
            "seed": -1,
        },
    )
    print(output["outputs"][0])  # → URL of the generated output

Veo3.1 Fast Reference To Video is a Google model for video generation from images, exposed as a REST API on WaveSpeedAI. You can call it programmatically or try it from the playground above.
POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/google/google-veo3.1-fast-reference-to-video.
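The submit-then-poll flow described above can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the official client: `fetch_result` and `wait_for_output` are hypothetical helpers, the `data.status` field location and the `"failed"` status value are assumptions, and the polling interval and timeout are arbitrary choices. Only the result URL pattern, the `"completed"` status, and `data.outputs[0]` come from this page.

```python
import json
import time
import urllib.request
from typing import Callable

RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id: str, api_key: str) -> dict:
    """GET the prediction result and return the parsed JSON body."""
    req = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_output(
    fetch: Callable[[], dict],
    interval: float = 2.0,   # seconds between polls (arbitrary choice)
    timeout: float = 600.0,  # overall deadline in seconds (arbitrary choice)
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the prediction completes, then return the first output URL."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch().get("data", {})
        status = data.get("status")  # assumption: status lives under "data"
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumption: failure is reported as "failed"
            raise RuntimeError(f"prediction failed: {data}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not complete before the timeout")
        sleep(interval)
```

With a prediction id in hand, a call might look like `wait_for_output(lambda: fetch_result(request_id, api_key))`. Passing the fetch step in as a callable keeps the loop easy to test without network access.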
Veo3.1 Fast Reference To Video starts at $0.64 per run. That figure is the base price for 720p without audio; the final charge depends only on resolution and generate_audio, rising to $0.96 for 1080p with audio. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.
Key inputs: `prompt`, `images`, `aspect_ratio`, `resolution`, `generate_audio`, `seed`, `negative_prompt`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/google/google-veo3.1-fast-reference-to-video.
Sign up for a free WaveSpeedAI account to claim starter credits, copy your API key from /accesskey, then call the endpoint shown in the API tab of the playground. The playground also auto-generates a code sample in Python, JavaScript, or cURL for the parameters you've set.
Commercial usage rights depend on the model's license, set by its provider (Google). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.