Kuaishou · video · From $0.42/run

Kling Omni O3 API

Kuaishou Kling Omni Video O3 — advanced unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Standard, Pro, and 4K tiers for text-to-video, image-to-video, reference-to-video, and conversational video-edit.

Standard (kwaivgi/kling-video-o3-std/*), Pro (kwaivgi/kling-video-o3-pro/*), and 4K (kwaivgi/kling-video-o3-4k/*) tiers. MVL technology maintains subject consistency across modalities. Video-edit accepts natural-language commands to remove objects, swap backgrounds, restyle scenes, and apply localized 3-10s transforms.

About the Kling Omni O3 API

What Kling Omni O3 does, how it fits in the Kuaishou model lineup, and why teams reach for it.

Kling Omni O3 is a video generation model from Kuaishou, available through the WaveSpeedAI REST API. It is a unified multi-modal model built on MVL (Multi-modal Visual Language) technology, offered in Standard, Pro, and 4K tiers for text-to-video, image-to-video, reference-to-video, and conversational video-edit.

The Kling Omni O3 family on WaveSpeedAI ships 11 REST endpoints covering video-to-video, text-to-video, and image-to-video workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several from the same API key to compose multi-step pipelines.

Run Kling Omni O3 through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All Kling Omni O3 API endpoints

11 Kling Omni O3 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Kling Video O3 Std Video Edit

Kling Omni Video O3 Video-Edit (Standard) enables natural-language video edits: remove or replace objects, swap backgrounds, restyle scenes, change weather/lighting, and apply localized 3-10s transformations with strong temporal consistency. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

video-to-video · from $0.63

Kling Video O3 Pro Video Edit

Kling Omni Video O3 Video-Edit enables conversational video editing through natural language commands. Remove objects, change backgrounds, modify styles, adjust weather/lighting, and transform scenes with simple text instructions like 'remove pedestrians' or 'change daytime to dusk'. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

video-to-video · from $0.84

Kling Video O3 Std Text To Video

Kling Omni Video O3 (Standard) is Kuaishou's advanced unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Text-to-Video mode generates cinematic videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. Supports audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

text-to-video · from $0.42

Kling Video O3 4k Text To Video

Kling Video O3 4K generates cinematic 4K videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. Supports multi-prompt scene transitions, element references, and optional audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

text-to-video · from $2.10

Kling Video O3 Pro Text To Video

Kling Omni Video O3 is Kuaishou's advanced unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Text-to-Video mode generates cinematic videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. Supports audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

text-to-video · from $0.56

Kling Video O3 Std Reference To Video

Kling Omni Video O3 (Standard) Reference-to-Video generates creative videos using character, prop, or scene references from multiple viewpoints. Extracts subject features and creates new video content while maintaining identity consistency across frames. Supports audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.42

Kling Video O3 4k Reference To Video

Kling Video O3 4K Reference-to-Video generates creative 4K videos using character, prop, or scene references from multiple viewpoints. Extracts subject features and creates new video content while maintaining identity consistency across frames. Supports multi-reference images, video guidance, and optional audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video · from $2.10

Kling Video O3 Pro Reference To Video

Kling Omni Video O3 Reference-to-Video generates creative videos using character, prop, or scene references from multiple viewpoints. Extracts subject features and creates new video content while maintaining identity consistency across frames. Supports audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.56

Kling Video O3 4k Image To Video

Kling Video O3 4K Image-to-Video transforms static images into dynamic cinematic 4K videos. Maintains subject consistency while adding natural motion, physics simulation, and seamless scene dynamics. Supports start/end frame control, multi-prompt, and optional audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video · from $2.10

Kling Video O3 Pro Image To Video

Kling Omni Video O3 Image-to-Video transforms static images into dynamic cinematic videos using MVL (Multi-modal Visual Language) technology. Maintains subject consistency while adding natural motion, physics simulation, and seamless scene dynamics. Supports audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.56

Kling Video O3 Std Image To Video

Kling Omni Video O3 (Standard) Image-to-Video transforms static images into dynamic cinematic videos using MVL (Multi-modal Visual Language) technology. Maintains subject consistency while adding natural motion, physics simulation, and seamless scene dynamics. Supports audio generation. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.42

How to use the Kling Omni O3 API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.
2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o3-std/text-to-video. The endpoint returns a prediction id immediately. Generations are asynchronous, so you don't hold an open connection during inference.
3. Poll for completion
GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result, where {request_id} is the prediction id returned in step 2. Start around every 2 seconds, then increase toward 5-10 seconds for long-running tasks to reduce unnecessary requests. Stop on completed, failed, cancelled, or timeout.
4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated video on the WaveSpeedAI CDN.
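
The polling guidance in step 3 can be sketched as a small backoff loop. This is a minimal sketch under stated assumptions, not WaveSpeedAI's own client: the names `poll_intervals` and `wait_for_result`, the growth factor of 1.5, and the 10-second cap are illustrative choices, and `fetch` is a hypothetical callable that performs the GET request and returns the result object. Only the status names come from the steps above.

```python
import time
from typing import Callable, Iterator

# Terminal failure statuses named in step 3.
TERMINAL_FAILURES = {"failed", "cancelled", "timeout"}


def poll_intervals(start: float = 2.0, cap: float = 10.0, factor: float = 1.5) -> Iterator[float]:
    """Yield sleep intervals that start at `start` seconds and grow toward `cap`."""
    delay = start
    while True:
        yield delay
        delay = min(cap, delay * factor)


def wait_for_result(fetch: Callable[[], dict], sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `fetch` until the prediction reaches a terminal status.

    `fetch` is a hypothetical callable (an assumption of this sketch) that
    returns the result object, already unwrapped from any `data` envelope.
    Any status that is not terminal is treated as "still running".
    """
    for delay in poll_intervals():
        result = fetch()
        status = result.get("status")
        if status == "completed":
            return result
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"Prediction ended with status {status!r}")
        sleep(delay)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting; in production you may also want an overall deadline so a stuck task cannot be polled forever.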

What you can build with Kling Omni O3

Common workflows developers and creators use the Kling Omni O3 API for.

Unified text-to-video with MVL

kwaivgi/kling-video-o3-std/text-to-video generates cinematic videos from text prompts using Kling Omni's MVL (Multi-modal Visual Language) technology — Kuaishou's advanced unified multi-modal architecture.

text-to-video · mvl · cinematic

Image-to-video with subject consistency

kwaivgi/kling-video-o3-std/image-to-video transforms static images into dynamic cinematic videos while maintaining subject consistency and adding natural motion. Use it when you start from a key still.

image-to-video · consistency · motion

Reference-to-video from multiple viewpoints

kwaivgi/kling-video-o3-std/reference-to-video generates creative videos using character, prop, or scene references from multiple viewpoints — extracts subject features and creates new content while maintaining identity.

reference · multi-view · identity

Conversational video-edit

kwaivgi/kling-video-o3-std/video-edit enables natural-language video edits: remove or replace objects, swap backgrounds, restyle scenes, change weather/lighting, and apply localized 3-10s transforms — edit via prompt rather than manual masking.

video-edit · nlp · restyle

Pro tier for top-tier output

kwaivgi/kling-video-o3-pro/* mirrors the Standard variant surface (text-to-video, image-to-video, reference-to-video, video-edit) at Pro quality. The prompt format is the same, so moving to Pro for delivery work only means switching the endpoint tier.

pro-tier · quality · delivery

4K tier for delivery resolution

kwaivgi/kling-video-o3-4k/* covers text-to-video, image-to-video, and reference-to-video at 4K delivery resolution — the top tier when output resolution is the limiting factor.

4k · delivery · resolution
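
Since the tiers share variant names and differ only in the path prefix, the submit URL can be assembled from a tier and a variant. The sketch below assumes the endpoint list on this page is complete; `endpoint_url` and `VARIANTS_BY_TIER` are illustrative names, not part of any WaveSpeedAI SDK. It rejects combinations the page does not list, such as video-edit on the 4K tier.

```python
BASE_URL = "https://api.wavespeed.ai/api/v3"

# Variants per tier, copied from the endpoint list on this page (assumed complete).
VARIANTS_BY_TIER = {
    "std": {"text-to-video", "image-to-video", "reference-to-video", "video-edit"},
    "pro": {"text-to-video", "image-to-video", "reference-to-video", "video-edit"},
    "4k": {"text-to-video", "image-to-video", "reference-to-video"},
}


def endpoint_url(tier: str, variant: str) -> str:
    """Build the submit URL for a tier/variant pair, rejecting unlisted combinations."""
    if tier not in VARIANTS_BY_TIER:
        raise ValueError(f"Unknown tier: {tier!r}")
    if variant not in VARIANTS_BY_TIER[tier]:
        raise ValueError(f"{variant!r} is not listed for the {tier!r} tier")
    return f"{BASE_URL}/kwaivgi/kling-video-o3-{tier}/{variant}"
```

For example, `endpoint_url("pro", "video-edit")` yields the Pro video-edit URL, while `endpoint_url("4k", "video-edit")` raises `ValueError` before any request is sent.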

Tips for prompting Kling Omni O3

Practical advice for getting better outputs from Kling Omni O3 — drawn from the patterns that work across video models in production pipelines.

Pick Standard vs Pro by delivery needs

kwaivgi/kling-video-o3-std/* for iteration and default delivery; kwaivgi/kling-video-o3-pro/* when Pro quality matters. Same variant names across tiers — only the tier prefix changes.

Use reference-to-video for identity-critical shots

When characters, props, or scenes must stay recognizable, reference-to-video extracts subject features from multiple viewpoints — stronger than text-only conditioning for serialized content.

Video-edit via natural language

Describe edits as conversational commands — remove object, swap background, change weather — rather than supplying masks. O3 video-edit handles localized 3-10s transforms from prompt alone.

Match variant to input type

text-to-video for greenfield; image-to-video for animating a still; reference-to-video for identity; video-edit for modifying existing footage. Pick the endpoint that matches your source material.
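
The variant-matching rule above can be written down as a small decision function. This is only a sketch: `pick_variant` and its parameter names are hypothetical, and the priority order (existing footage first, then references, then a single still) is an assumption about how to break ties when several inputs are available.

```python
def pick_variant(
    has_source_video: bool = False,
    reference_images: int = 0,
    has_start_image: bool = False,
) -> str:
    """Return the variant name that matches the available source material."""
    if has_source_video:
        # Modifying existing footage.
        return "video-edit"
    if reference_images > 0:
        # Identity-critical shots driven by character, prop, or scene references.
        return "reference-to-video"
    if has_start_image:
        # Animating a single still.
        return "image-to-video"
    # Nothing but a prompt.
    return "text-to-video"
```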

Kling Omni O3 API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).

Endpoint	Type	Starting price
kwaivgi/kling-video-o3-std/video-edit	video-to-video	$0.63
kwaivgi/kling-video-o3-pro/video-edit	video-to-video	$0.84
kwaivgi/kling-video-o3-std/text-to-video	text-to-video	$0.42
kwaivgi/kling-video-o3-4k/text-to-video	text-to-video	$2.10
kwaivgi/kling-video-o3-pro/text-to-video	text-to-video	$0.56
kwaivgi/kling-video-o3-std/reference-to-video	image-to-video	$0.42
kwaivgi/kling-video-o3-4k/reference-to-video	image-to-video	$2.10
kwaivgi/kling-video-o3-pro/reference-to-video	image-to-video	$0.56
kwaivgi/kling-video-o3-4k/image-to-video	image-to-video	$2.10
kwaivgi/kling-video-o3-pro/image-to-video	image-to-video	$0.56
kwaivgi/kling-video-o3-std/image-to-video	image-to-video	$0.42
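
Because every figure in the table is a starting price, adding them up gives only a lower bound for a multi-step pipeline. A minimal sketch of that arithmetic, assuming the prices above are current: `minimum_cost` is an illustrative helper, and the real charge depends on resolution, duration, output count, and references.

```python
from decimal import Decimal
from typing import Iterable

# Starting prices in USD, copied from the table above.
STARTING_PRICE_USD = {
    "kwaivgi/kling-video-o3-std/video-edit": Decimal("0.63"),
    "kwaivgi/kling-video-o3-pro/video-edit": Decimal("0.84"),
    "kwaivgi/kling-video-o3-std/text-to-video": Decimal("0.42"),
    "kwaivgi/kling-video-o3-4k/text-to-video": Decimal("2.10"),
    "kwaivgi/kling-video-o3-pro/text-to-video": Decimal("0.56"),
    "kwaivgi/kling-video-o3-std/reference-to-video": Decimal("0.42"),
    "kwaivgi/kling-video-o3-4k/reference-to-video": Decimal("2.10"),
    "kwaivgi/kling-video-o3-pro/reference-to-video": Decimal("0.56"),
    "kwaivgi/kling-video-o3-4k/image-to-video": Decimal("2.10"),
    "kwaivgi/kling-video-o3-pro/image-to-video": Decimal("0.56"),
    "kwaivgi/kling-video-o3-std/image-to-video": Decimal("0.42"),
}


def minimum_cost(endpoints: Iterable[str]) -> Decimal:
    """Sum the starting prices of the given endpoints (a lower bound, not a quote)."""
    return sum((STARTING_PRICE_USD[name] for name in endpoints), Decimal("0"))
```

A Standard text-to-video run followed by a Standard video-edit pass therefore costs at least $0.42 + $0.63 = $1.05. `Decimal` is used so the sum is not affected by binary floating-point rounding.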

Call the Kling Omni O3 API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example

# 1. Submit the prediction.
SUBMIT_RESPONSE=$(curl --silent --show-error --fail-with-body \
  -X POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o3-std/text-to-video" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}')

TASK=$(printf '%s' "$SUBMIT_RESPONSE" | jq 'if has("data") then .data else . end')
PREDICTION_ID=$(printf '%s' "$TASK" | jq -r '.id')
if [ -z "$PREDICTION_ID" ] || [ "$PREDICTION_ID" = "null" ]; then
  printf 'Submission response did not contain a prediction id\n' >&2
  exit 1
fi
RESULT_URL=$(printf '%s' "$TASK" | jq -r '.urls.get // empty')
if [ -z "$RESULT_URL" ]; then
  RESULT_URL="https://api.wavespeed.ai/api/v3/predictions/$PREDICTION_ID/result"
fi

# 2. Poll until the prediction finishes.
while true; do
  RESPONSE=$(curl --silent --show-error --fail-with-body "$RESULT_URL" \
    -H "Authorization: Bearer $WAVESPEED_API_KEY")
  RESULT=$(printf '%s' "$RESPONSE" | jq 'if has("data") then .data else . end')
  STATUS=$(printf '%s' "$RESULT" | jq -r '.status')
  case "$STATUS" in
    completed) printf '%s\n' "$RESULT" | jq '.outputs'; break ;;
    failed|cancelled|timeout) printf '%s\n' "$RESULT" | jq . >&2; exit 1 ;;
    created|processing) sleep 2 ;;
    *) printf 'Unexpected status: %s\n' "$STATUS" >&2; exit 1 ;;
  esac
done

Node.js example

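// Run as an ES module (for example a .mjs file) so top-level await works; the global fetch needs Node.js 18 or later.
const submitUrl = "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o3-std/text-to-video";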
const apiKey = process.env.WAVESPEED_API_KEY;
if (!apiKey) throw new Error('Set WAVESPEED_API_KEY');

async function requestJson(url, options = {}) {
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(await response.text());
  return response.json();
}

// 1. Submit the prediction.
const body = await requestJson(submitUrl, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({}),
});
const task = body.data ?? body;
const resultUrl = task.urls?.get ||
  `https://api.wavespeed.ai/api/v3/predictions/${task.id}/result`;

// 2. Poll until the prediction finishes.
while (true) {
  const resultBody = await requestJson(resultUrl, {
    headers: { "Authorization": `Bearer ${apiKey}` },
  });
  const result = resultBody.data ?? resultBody;
  if (result.status === "completed") {
    console.log(result.outputs);
    break;
  }
  if (["failed", "cancelled", "timeout"].includes(result.status)) throw new Error(JSON.stringify(result));
  if (!["created", "processing"].includes(result.status)) throw new Error("Unexpected status: " + result.status);
  await new Promise(resolve => setTimeout(resolve, 2000));
}

Python example

import json
import os
import time
from urllib.request import Request, urlopen

api_key = os.environ["WAVESPEED_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
payload = {}

def request_json(url, data=None):
    request = Request(url, data=data, headers=headers, method="POST" if data else "GET")
    with urlopen(request) as response:
        return json.load(response)

# 1. Submit the prediction.
body = request_json("https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o3-std/text-to-video", json.dumps(payload).encode())
task = body.get("data", body)
result_url = task.get("urls", {}).get("get") or f"https://api.wavespeed.ai/api/v3/predictions/{task['id']}/result"

# 2. Poll until the prediction finishes.
while True:
    result_body = request_json(result_url)
    result = result_body.get("data", result_body)
    status = result.get("status")
    if status == "completed":
        print(result.get("outputs", []))
        break
    if status in {"failed", "cancelled", "timeout"}:
        raise RuntimeError(result)
    if status not in {"created", "processing"}:
        raise RuntimeError(f"Unexpected status: {status}")
    time.sleep(2)
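
The examples above print the whole `outputs` list. To get only the media URL described in the walkthrough, a small helper can unwrap the response and return the first entry. This is a sketch: `first_output_url` is a hypothetical name, the optional `data` envelope handling mirrors the examples above, and the URL in the usage comment is a placeholder rather than a real CDN address.

```python
def first_output_url(body: dict) -> str:
    """Return outputs[0] from a completed prediction response.

    Accepts either the raw response or one wrapped in a `data` envelope,
    as the examples above do.
    """
    result = body.get("data", body)
    status = result.get("status")
    if status != "completed":
        raise ValueError(f"Prediction is not completed (status: {status!r})")
    outputs = result.get("outputs") or []
    if not outputs:
        raise ValueError("Completed prediction has no outputs")
    return outputs[0]


# Usage with a placeholder response:
# first_output_url({"data": {"status": "completed", "outputs": ["https://example.com/video.mp4"]}})
```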

Kling Omni O3 vs alternatives

When to pick Kling Omni O3 over similar models on WaveSpeedAI.

Kling Omni O3 vs Kling 3.0

Kling 3.0 ships Standard, Pro, and 4K tiers with native audio and a dedicated motion-control endpoint. Kling O3 is the newer Omni architecture with MVL technology and conversational video-edit. It offers a broader edit surface across its own Standard, Pro, and 4K tiers, but the O3 family has no dedicated motion-control endpoint.

Kling Omni O3 vs Kling Omni O1

Kling O1 is the first Omni unified model with the same variant pattern (text-to-video, image-to-video, reference-to-video, video-edit). O3 is the advanced successor with improved MVL technology. Pick O3 for new projects unless O1 availability or pricing fits better.

Kling Omni O3 vs Wan 2.7

Wan 2.7 ships reference-to-video, video-edit, video-extend, image-edit, and text-to-image in one family at lower cost. Kling O3 stays focused on the Omni video surface with MVL conditioning and conversational edit commands.

Kling Omni O3 API — Frequently asked questions

Pricing, license, integration — common questions about running Kling Omni O3 on WaveSpeedAI.

What is the Kling Omni O3 API?

Kling Omni O3 is a Kuaishou video generation model exposed as a REST API on WaveSpeedAI. It is a unified multi-modal video model with MVL (Multi-modal Visual Language) technology, offered in Standard, Pro, and 4K tiers for text-to-video, image-to-video, reference-to-video, and conversational video-edit. You can call it programmatically or try it from the playground.

How do I call the Kling Omni O3 API?

Sign up for a WaveSpeedAI account, copy your API key from wavespeed.ai/accesskey, then POST to https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o3-std/text-to-video with your input as JSON. The endpoint returns a prediction id. Poll the result endpoint starting around every 2 seconds, increase the interval for long-running tasks, and stop on any terminal status (completed, failed, cancelled, or timeout). Production-oriented Python, Node.js, and cURL examples are above.

How much does the Kling Omni O3 API cost?

Kling Omni O3 starts at $0.42 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which Kling Omni O3 variants are available?

WaveSpeedAI hosts 11 live Kling Omni O3 endpoints: kwaivgi/kling-video-o3-std/video-edit, kwaivgi/kling-video-o3-pro/video-edit, kwaivgi/kling-video-o3-std/text-to-video, kwaivgi/kling-video-o3-4k/text-to-video, kwaivgi/kling-video-o3-pro/text-to-video, kwaivgi/kling-video-o3-std/reference-to-video, kwaivgi/kling-video-o3-4k/reference-to-video, kwaivgi/kling-video-o3-pro/reference-to-video, kwaivgi/kling-video-o3-4k/image-to-video, kwaivgi/kling-video-o3-pro/image-to-video, and kwaivgi/kling-video-o3-std/image-to-video. Each variant has its own playground page and pricing.

Can I use Kling Omni O3 outputs commercially?

Commercial usage rights follow the Kuaishou model license. Most Kuaishou models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use Kling Omni O3 on WaveSpeedAI instead of going direct?

One API key and one billing account cover Kling Omni O3 and 1,000+ other AI models from other providers. There is no per-vendor SDK setup, no separate rate-limit envelope, and no integration code to rewrite for each vendor. Pricing is typically at parity with or below Kuaishou's direct API.

About Kuaishou

The team behind Kling Omni O3 and the broader Kuaishou model lineup on WaveSpeedAI.

Kuaishou is a major Chinese short-video platform and the team behind the Kling family of video generation models. Kling 3.0 ships Standard, Pro, and 4K tiers with native audio synthesis (a sound parameter on every variant), plus a dedicated motion-control endpoint that transfers motion from a reference video to animate a still character image.

Related model APIs on WaveSpeedAI

Other AI APIs from Kuaishou and the rest of the video model lineup — one API key, one billing account.

Kling 3.0 API

Kuaishou

Kuaishou Kling 3.0 — text-to-video and image-to-video with smooth motion, cinematic visuals, accurate prompt adherence, and native audio. Three tiers: Standard, Pro, and 4K.

Kling 3.0 Motion Control API

Kuaishou

Kuaishou Kling 3.0 Motion Control — transfers motion from a reference video to animate a still character image. Upload a character image plus a motion clip (dance, action, gesture); the model extracts the movement to generate smooth, realistic video. Standard and Pro tiers.

Seedance 2.5 API

ByteDance

ByteDance Seedance 2.5 API access is coming to WaveSpeedAI with planned 30-second single-shot video generation, support for up to 50 reference files, and more controllable video generation and editing. Track Seedance 2.5 release status here and test the current Seedance 2.0 API family today.

Seedance 2.0 Mini API

ByteDance

ByteDance Seedance 2.0 Mini — the faster, lower-cost tier of Seedance 2.0. Same cinematic multi-shot storytelling, AI camera control, and character consistency with native audio, at 50% of the standard price.

Seedance 2.0 API

ByteDance

ByteDance Seedance 2.0 — Hollywood-grade cinematic video with native audio-visual synchronization, director-level camera and lighting control, and exceptional motion stability. Built on Seed's unified multimodal architecture.

Seedance 1.5 Pro API

ByteDance

ByteDance Seedance 1.5 Pro — cinematic, live-action-leaning clips with strong prompt adherence, expressive motion, and stable aesthetics. 4-12s duration with Smart Duration, multiple aspect ratios, reproducible generation via seeds.

Start building with Kling Omni O3 on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from Kuaishou and every other provider.
