Shengshu·video·From $0.15/run

Vidu Q3 API

Shengshu Vidu Q3 covers text-to-video, image-to-video, reference-to-video (1-4 reference images for multi-entity consistency), and start-end-to-video across three tiers: Standard, Pro, and Turbo.

The Pro tier produces 1-16 second outputs, and the Turbo tier is the faster option. Reference-to-video accepts 1-4 reference images for multi-entity consistent videos at 360p-1080p, up to 16 seconds. Start-end-to-video bridges two keyframes (1-16 seconds on Pro). The image-to-video-pro variant supports 720p, 1080p, 2K, and 4K.

About the Vidu Q3 API

What Vidu Q3 does, how it fits in the Shengshu model lineup, and why teams reach for it.

Vidu Q3 is a video generation model from Shengshu, available through the WaveSpeedAI REST API.

The Vidu Q3 family on WaveSpeedAI ships 14 REST endpoints covering image-to-video and text-to-video workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several from the same API key to compose multi-step pipelines.

Run Vidu Q3 through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All Vidu Q3 API endpoints

14 Vidu Q3 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Q3 Image To Video Spicy

Vidu Q3 Image-to-Video Spicy generates unlimited high-quality videos from images with smooth animations and diverse motion, optimized for scalable content generation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.35

Q3 Pro Text To Video

Vidu Q3 Pro Text to Video is a fast AI video generation model that creates high-quality, audio-capable videos from text prompts with support for 1–16 second outputs. Ready-to-use REST inference API for cinematic clips, advertising creatives, social media videos, product visuals, storytelling, and professional text-to-video workflows with simple integration, no coldstarts, and affordable pricing.

text-to-video · from $0.25

Q3 Pro Start End To Video

Vidu Q3 Pro Start-End-to-Video creates smooth transitions between two keyframes with viduq3-pro (1–16s). Billing follows Vidu's published Q3-pro per-second rates by resolution. Ready-to-use REST inference API on WaveSpeed.

image-to-video · from $0.25

Q3 Turbo Start End To Video

Vidu Q3 Turbo Start-End-to-Video creates smooth transitions between two images with faster processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.30

Q3 Start End To Video

Vidu Q3 Start-End-to-Video generates a video that transitions between a start image and an end image, with high visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.35

Q3 Reference To Video

Vidu Q3 Reference-to-Video Mix generates multi-entity consistent videos from 1-4 reference images with text prompt guidance. Supports 360p to 1080p resolutions, up to 16 seconds duration, multiple aspect ratios, and optional audio generation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.35

Q3 Image To Video Pro

Vidu Q3 Image-to-Video Pro generates high-resolution videos (720p/1080p/2K/4K) from images with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.45

Q3 Pro Image To Video

Vidu Q3 Pro Image-to-Video animates still images with high-quality motion via viduq3-pro (1–16s). Billing follows Vidu's published Q3-pro per-second rates by resolution. Ready-to-use REST inference API on WaveSpeed.

image-to-video · from $0.25

Q3 Turbo Image To Video

Vidu Q3 Turbo Image-to-Video animates static images with high-quality motion and faster processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.30

Q3 Text To Video

Vidu Q3 Text-to-Video turns text prompts into high-quality videos with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video · from $0.35

Q3 Drama

Vidu Q3 Drama generates complete script-driven drama videos from scripts and structured assets, including characters, scenes, tools, and references. It plans the narrative structure, scene pacing, and transitions to create a story-driven drama in one request, supporting up to 180 seconds. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $1.12

Q3 Image To Video

Vidu Q3 Image-to-Video turns still images into high-quality videos with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.35

Q3 Ad

Vidu Q3 Ad Video generates commercial ad videos from 1 to 7 reference images with prompt guidance, supporting 720P / 1080P output and synchronized audio for product ads, brand campaigns, marketing creatives, and promotional videos. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.15

Q3 Drama Clip

Vidu Q3 Drama Clip generates 8-12 second script-driven drama videos from structured assets, including characters, scenes, and tools. It is ideal for compact story scenes, storyboard shots, and focused narrative moments. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $1.12

How to use the Vidu Q3 API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.

2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/vidu/q3/text-to-video. The endpoint returns a prediction id immediately. Generations are asynchronous, so you don't hold an open connection during inference.

3. Poll for completion
GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds. The response includes a status field; keep polling until it flips from "queued" or "processing" to "completed".

4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated video on the WaveSpeedAI CDN.
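
The four steps above can be sketched with nothing but the Python standard library. This is an illustrative sketch, not the official client: it assumes the prediction id is returned at data.id and that the status field sits at data.status, and it treats any status other than "queued", "processing", or "completed" as a terminal failure. The function names are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def http_json(method, url, api_key, body=None):
    """Send one request with a Bearer token and return the decoded JSON."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_output(fetch_result, interval=1.5, timeout=600.0, sleep=time.sleep):
    """Call fetch_result() until status is "completed"; return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result()["data"]
        status = data["status"]  # assumed location of the status field
        if status == "completed":
            return data["outputs"][0]
        if status not in ("queued", "processing"):
            # Assumption: anything else (for example "failed") is terminal.
            raise RuntimeError(f"prediction ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not complete in time")
        sleep(interval)


def generate(api_key, payload, endpoint="vidu/q3/text-to-video"):
    """Submit a prediction, poll it every 1-2 seconds, return the output URL."""
    submitted = http_json("POST", f"{API_BASE}/{endpoint}", api_key, payload)
    request_id = submitted["data"]["id"]  # assumed location of the prediction id
    result_url = f"{API_BASE}/predictions/{request_id}/result"
    return wait_for_output(lambda: http_json("GET", result_url, api_key))
```

Keeping the polling loop separate from the HTTP call makes it possible to exercise the status handling with canned responses before spending credits on a real generation.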

What you can build with Vidu Q3

Common workflows developers and creators use the Vidu Q3 API for.

Reference-to-video with 1-4 references

vidu/q3/reference-to-video generates multi-entity consistent videos from 1-4 reference images with text-prompt guidance. Supports 360p-1080p, up to 16 seconds, multiple aspect ratios. Useful for ensemble scenes where multiple referenced subjects must stay coherent.

reference · multi-entity · consistency

Start-end keyframe interpolation

vidu/q3/start-end-to-video creates videos by interpolating between two keyframes. Pro and Turbo tiers also support start-end with 1-16s durations. Useful for animatic-style storyboarding where you have the start and end stills.

interpolation · keyframes · animatic

Up to 16-second outputs

The Pro and reference-to-video variants are listed with 1-16 second durations. That is longer than the 5-8 second window of many competing video models, which is useful for full narrative beats in a single generation.

long-form · 16-seconds · duration

High-resolution image-to-video

vidu/q3/image-to-video-pro supports 720p / 1080p / 2K / 4K from images. Useful for delivery-grade output without an upscaling pass when starting from a still.

i2v-pro · 4k · high-res

Pro tier (cheapest)

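vidu/q3-pro/* is the cheapest of the three Vidu Q3 tiers (from $0.25 per run, versus $0.30 for Turbo and $0.35 for Standard), which makes it useful for high-volume work. It covers image-to-video, text-to-video, and start-end-to-video with 1-16 second outputs.

pro-tier · cost · volume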

Spicy variant for unlimited content

vidu/q3/image-to-video-spicy generates "unlimited high-quality videos from images with smooth animations and diverse motion, optimized for scalable content generation." Useful for high-volume i2v pipelines.

spicy · scalable · unlimited

Tips for prompting Vidu Q3

Practical advice for getting better outputs from Vidu Q3 — drawn from the patterns that work across video models in production pipelines.

Use the 1-4 reference-image support for multi-entity scenes

vidu/q3/reference-to-video accepts 1-4 reference images for multi-entity consistent videos with text-prompt guidance. Useful for ensemble cast scenes, group product showcases, and multi-subject storyboards.
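
A small sketch of checking the 1-4 image limit before a request is sent, so an out-of-range call fails locally instead of costing a round trip. The field names prompt, images, and duration are assumptions made for illustration and are not confirmed parameter names; the endpoint's playground shows the real schema.

```python
def build_reference_payload(prompt, image_urls, duration=8):
    """Build a request body for vidu/q3/reference-to-video.

    The keys "prompt", "images", and "duration" are hypothetical names.
    """
    images = list(image_urls)
    if not 1 <= len(images) <= 4:
        raise ValueError(f"expected 1-4 reference images, got {len(images)}")
    if not 1 <= duration <= 16:
        raise ValueError(f"expected a duration of 1-16 seconds, got {duration}")
    return {"prompt": prompt, "images": images, "duration": duration}
```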

Start-end frame interpolation for keyframe workflows

vidu/q3/start-end-to-video bridges two stills with generated motion. Particularly useful for animatic-style work, key-pose-driven storyboarding, and stitching concept art into motion without re-prompting each segment.
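
One way to stitch a sequence of stills into motion is to turn N stills into N-1 consecutive (start, end) pairs and submit one start-end request per pair. The sketch below shows only that pairing step. The payload keys prompt, start_image, and end_image are hypothetical names chosen for illustration, not documented parameters.

```python
def keyframe_pairs(stills):
    """Return consecutive (start, end) pairs: [a, b, c] -> [(a, b), (b, c)]."""
    if len(stills) < 2:
        raise ValueError("need at least two stills to interpolate between")
    return list(zip(stills, stills[1:]))


def start_end_payloads(stills, prompt=""):
    """Build one request body per segment (key names are hypothetical)."""
    return [
        {"prompt": prompt, "start_image": start, "end_image": end}
        for start, end in keyframe_pairs(stills)
    ]
```

Because each segment ends on the still that the next one starts from, the resulting clips can be concatenated without a visible jump at the joins.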

Pick the tier deliberately

Standard is the default delivery tier; the Pro tier (1-16 second outputs) is positioned for high-volume work; the Turbo tier prioritizes speed. In the pricing table on this page, Pro starts at $0.25 per run, Turbo at $0.30, and Standard at $0.35, so check the table for the current per-tier cost before committing a pipeline to one tier.
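
To make the tier choice explicit in code, a lookup like the one below maps a tier and a mode to an endpoint path. The prefixes and the mode sets are taken from the endpoint paths in this page's pricing table (Turbo is listed there without a text-to-video endpoint); the function name and the tier keys are hypothetical.

```python
TIER_PREFIX = {
    "standard": "vidu/q3",
    "pro": "vidu/q3-pro",
    "turbo": "vidu/q3-turbo",
}

# Modes listed per tier in the pricing table on this page.
MODES_BY_TIER = {
    "standard": {"text-to-video", "image-to-video", "start-end-to-video",
                 "reference-to-video"},
    "pro": {"text-to-video", "image-to-video", "start-end-to-video"},
    "turbo": {"image-to-video", "start-end-to-video"},
}


def endpoint_for(tier, mode):
    """Return the endpoint path for a tier and mode, or raise ValueError."""
    if tier not in TIER_PREFIX:
        raise ValueError(f"unknown tier: {tier!r}")
    if mode not in MODES_BY_TIER[tier]:
        raise ValueError(f"{mode!r} is not listed for the {tier} tier")
    return f"{TIER_PREFIX[tier]}/{mode}"
```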

image-to-video-pro for high-res output

vidu/q3/image-to-video-pro supports 720p / 1080p / 2K / 4K resolution from images. Useful for delivery-grade output without an upscaling pass when starting from a still.

Spicy variant for unlimited content generation

vidu/q3/image-to-video-spicy is positioned for "unlimited high-quality videos from images with smooth animations and diverse motion, optimized for scalable content generation." Useful for high-volume i2v pipelines.

Vidu Q3 API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).

Endpoint	Type	Starting price
vidu/q3/image-to-video-spicy	image-to-video	$0.35
vidu/q3-pro/text-to-video	text-to-video	$0.25
vidu/q3-pro/start-end-to-video	image-to-video	$0.25
vidu/q3-turbo/start-end-to-video	image-to-video	$0.30
vidu/q3/start-end-to-video	image-to-video	$0.35
vidu/q3/reference-to-video	image-to-video	$0.35
vidu/q3/image-to-video-pro	image-to-video	$0.45
vidu/q3-pro/image-to-video	image-to-video	$0.25
vidu/q3-turbo/image-to-video	image-to-video	$0.30
vidu/q3/text-to-video	text-to-video	$0.35
vidu/q3/drama	image-to-video	$1.12
vidu/q3/image-to-video	image-to-video	$0.35
vidu/q3-ad	image-to-video	$0.15
vidu/q3/drama-clip	image-to-video	$1.12
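
As a worked example of reading the table, the sketch below copies the starting prices into a dictionary (in cents, to avoid floating-point drift) and computes a lower bound for a batch. The starting price is a floor, not a quote: the actual charge grows with resolution, duration, output count, and references. The function names are hypothetical.

```python
STARTING_PRICE_CENTS = {
    "vidu/q3/image-to-video-spicy": 35,
    "vidu/q3-pro/text-to-video": 25,
    "vidu/q3-pro/start-end-to-video": 25,
    "vidu/q3-turbo/start-end-to-video": 30,
    "vidu/q3/start-end-to-video": 35,
    "vidu/q3/reference-to-video": 35,
    "vidu/q3/image-to-video-pro": 45,
    "vidu/q3-pro/image-to-video": 25,
    "vidu/q3-turbo/image-to-video": 30,
    "vidu/q3/text-to-video": 35,
    "vidu/q3/drama": 112,
    "vidu/q3/image-to-video": 35,
    "vidu/q3-ad": 15,
    "vidu/q3/drama-clip": 112,
}


def minimum_budget_usd(endpoint, runs):
    """Lower bound in USD for `runs` generations at the starting price."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return STARTING_PRICE_CENTS[endpoint] * runs / 100


def cheapest_for_mode(mode):
    """Cheapest endpoint whose path ends with "/<mode>"."""
    candidates = [e for e in STARTING_PRICE_CENTS if e.endswith("/" + mode)]
    if not candidates:
        raise ValueError(f"no endpoint found for mode {mode!r}")
    return min(candidates, key=STARTING_PRICE_CENTS.get)
```

For instance, 200 runs of vidu/q3-pro/text-to-video cost at least 200 × $0.25 = $50.00, while the same batch on vidu/q3/text-to-video starts at 200 × $0.35 = $70.00.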

Call the Vidu Q3 API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example

# 1. Submit a prediction. Replace {} with the input JSON for this endpoint.
curl -X POST "https://api.wavespeed.ai/api/v3/vidu/q3/text-to-video" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}'

# 2. Poll the result until status = "completed".
#    Set REQUEST_ID to the prediction id returned by step 1.
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/$REQUEST_ID/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# 3. Read the output URL from data.outputs[0].

Node.js example

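// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY

// CommonJS has no top-level await, so run the call inside an async function.
async function main() {
  const result = await client.run("vidu/q3/text-to-video", {});
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);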

Python example

# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "vidu/q3/text-to-video",
    {}
)
print(output["outputs"][0])  # → URL of the generated output

Vidu Q3 vs alternatives

When to pick Vidu Q3 over similar models on WaveSpeedAI.

Vidu Q3 vs Seedance 2.0

Seedance 2.0 ships native audio synthesis across every variant and the Turbo tier. Vidu Q3 is meaningfully cheaper and ships start-end interpolation as a first-class endpoint that Seedance doesn't have.

Vidu Q3 vs Kling 3.0

Kling 3.0 covers Standard, Pro, and 4K with motion-control as a sub-endpoint. Vidu Q3 is cheaper for most tiers and ships start-end-to-video and reference-to-video (1-4 refs) as core variants.

Vidu Q3 vs Wan 2.7

Wan 2.7 ships reference-to-video, video-edit, video-extend, image-edit, and text-to-image in the same family. Vidu Q3 stays focused on the video-generation surface with cheaper tiers and the start-end interpolation workflow.

Vidu Q3 API — Frequently asked questions

Pricing, license, integration — common questions about running Vidu Q3 on WaveSpeedAI.

What is the Vidu Q3 API?

Vidu Q3 is a Shengshu video generation model exposed as a REST API on WaveSpeedAI. It covers text-to-video, image-to-video, reference-to-video (1-4 reference images for multi-entity consistency), and start-end-to-video across three tiers: Standard, Pro, and Turbo. Some variants produce outputs up to 16 seconds long. You can call it programmatically or try it in the WaveSpeedAI playground.

How do I call the Vidu Q3 API?

Sign up for a WaveSpeedAI account, copy your API key from /accesskey, then POST to https://api.wavespeed.ai/api/v3/vidu/q3/text-to-video with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to "completed", then read the output URL from data.outputs[0]. Full Python / Node.js / cURL examples are above.

How much does the Vidu Q3 API cost?

Vidu Q3 starts at $0.15 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which Vidu Q3 variants are available?

WaveSpeedAI hosts 14 live Vidu Q3 endpoints: vidu/q3/image-to-video-spicy, vidu/q3-pro/text-to-video, vidu/q3-pro/start-end-to-video, vidu/q3-turbo/start-end-to-video, vidu/q3/start-end-to-video, vidu/q3/reference-to-video, vidu/q3/image-to-video-pro, vidu/q3-pro/image-to-video, vidu/q3-turbo/image-to-video, vidu/q3/text-to-video, vidu/q3/drama, vidu/q3/image-to-video, vidu/q3-ad, and vidu/q3/drama-clip. Each variant has its own playground page and pricing.

Can I use Vidu Q3 outputs commercially?

Commercial usage rights follow the Shengshu model license. Most Shengshu models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use Vidu Q3 on WaveSpeedAI instead of going direct?

One API key and one billing account cover Vidu Q3 and 1,000+ other AI models from other providers. There is no per-vendor SDK setup, no separate rate-limit envelope, and no integration code to rewrite for each vendor. Pricing is typically at parity with or below Shengshu's direct API.

About Shengshu

The team behind Vidu Q3 and the broader Shengshu model lineup on WaveSpeedAI.

Shengshu Technology is a Chinese AI lab spun out of Tsinghua University, behind the Vidu family of video generation models. Vidu Q3 ships text-to-video, image-to-video, reference-to-video (1-4 reference images for multi-entity consistency), and start-end-to-video (keyframe interpolation between two stills) across Standard, Pro, and Turbo tiers. Some variants support up to 16-second outputs.

Related model APIs on WaveSpeedAI

Other AI APIs from Shengshu and the rest of the video model lineup — one API key, one billing account.

Seedance 2.5 API

ByteDance

ByteDance Seedance 2.5 API access is coming to WaveSpeedAI with planned 30-second single-shot video generation, support for up to 50 reference files, and more controllable video generation and editing. Track Seedance 2.5 release status here and test the current Seedance 2.0 API family today.

Seedance 2.0 Mini API

ByteDance

ByteDance Seedance 2.0 Mini — the faster, lower-cost tier of Seedance 2.0. Same cinematic multi-shot storytelling, AI camera control, and character consistency with native audio, at 50% of the standard price.

Seedance 2.0 API

ByteDance

ByteDance Seedance 2.0 — Hollywood-grade cinematic video with native audio-visual synchronization, director-level camera and lighting control, and exceptional motion stability. Built on Seed's unified multimodal architecture.

Seedance 1.5 Pro API

ByteDance

ByteDance Seedance 1.5 Pro — cinematic, live-action-leaning clips with strong prompt adherence, expressive motion, and stable aesthetics. 4-12s duration with Smart Duration, multiple aspect ratios, reproducible generation via seeds.

Veo 3.1 API

Google

Google Veo 3.1 — text-to-video with synchronized native audio at 1080p. Three tiers (Standard, Fast, Lite) with text-to-video, image-to-video, reference-to-video, and video-extend, plus start-end-to-video on the Lite tier.

Wan 2.7 API

Alibaba

Alibaba WAN 2.7 — coherent cinematic video with crisp detail, stable motion, and strong instruction-following. Separate endpoints for text-to-video, image-to-video, reference-to-video, video-edit, video-extend, plus image-edit and text-to-image variants in the same family.

Start building with Vidu Q3 on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from Shengshu and every other provider.
