What Is Omni Flash? Capabilities, Access & Builder Guide
Google's Omni Flash launches video generation in Gemini App and Flow. What builders need to know about access, limits, and the API timeline.
Hi, I’m Dora. I spent the morning of I/O 2026 reading the rollout post and pricing pages, then opening the Gemini app to see what actually shipped versus what’s still labeled “coming weeks.” This is the notes version of that — for builders and product teams deciding whether Omni Flash changes anything in their pipeline.
Short version up top: the model is real and live in consumer surfaces today. The developer API is not. That gap matters more than the demos.
What Omni Flash Actually Is (Google’s first Omni-series model)
So what is Omni Flash, concretely? Gemini Omni Flash is the first model in Google’s new Omni family, announced May 19, 2026 at I/O. Google DeepMind frames it as “create anything from any input — starting with video.” The “starting with” matters — the long-term roadmap covers any-to-any modality routing, but what shipped today is multimodal input producing video output. Image and audio output are on the public roadmap, not in the product.
Position in the Gemini Omni roadmap
Omni is positioned as a family, not a single model. Flash is the consumer-grade first step. A higher-tier Omni Pro has been confirmed by Google DeepMind, with no release date — Nicole Brichtova told TechCrunch that Pro arrives “when we feel like we’re at a point where we have a step change above Flash.” Read that as: not soon.
Why Google describes it as “video version of Nano Banana”
Nano Banana, Google’s image generation/editing model launched in 2025, set the template for what Omni is trying to be for video: conversational editing, identity persistence across iterations, low friction for non-technical users. The official Google blog post introducing Omni draws the lineage explicitly. Architecturally, Omni Flash reasons across modalities in a single forward pass rather than relaying between specialized systems. Whether that translates to meaningfully better outputs versus a Veo-plus-audio-pipeline approach is something I’ll watch. The demos are curated. Real workflows aren’t.
Capabilities Confirmed at Launch
These are the Omni Flash capabilities confirmed in the product today, not what was teased.
Multimodal input (text, image, video, audio)
You can combine any of these as inputs in a single prompt. The model treats them as a unified scene description rather than concatenated assets. This is the cleanest part of the announcement — and what distinguishes it from Veo’s text-to-video pipeline.
Up to 10-second video output with native audio
Clips cap at 10 seconds. Brichtova described this as a deployment decision, not a model ceiling — a way to control compute demand while access widens. Audio generates synchronized with video, not bolted on after. The marble-bouncing demo Google’s CTO Koray Kavukcuoglu showed reporters produced impact sounds and ring sounds automatically. Worth flagging: independent testers told TechTimes that raw generation quality may trail ByteDance’s Seedance 2.0 and Alibaba’s Wan 2.7, even if the editing layer is stronger.
Conversational editing via natural language
Each instruction builds on the last. “Make the sculpture out of bubbles” — applied, state preserved, next instruction operates on the new state. This is the workflow shift, and the part most likely to save time in production: fewer prompt rewrites, fewer re-runs from scratch.
Likeness insertion and scene consistency
The Avatar feature lets you create a digital version of yourself (onboarding requires speaking a sequence of numbers on camera — a deepfake check borrowed loosely from OpenAI’s discontinued Sora Cameos). Once stored, the avatar persists across generations.
SynthID watermarking and safety constraints
Every output carries an invisible SynthID watermark, verifiable via the Gemini app, Chrome, and Google Search. SynthID has now marked over 100 billion AI-generated images and videos. Open editing of voice and likeness is held back — Google’s stated reason is responsible deployment.
Where You Can Access It Today
Three surfaces, different ceilings.
| Surface | Who gets it | Limits / rollout status |
|---|---|---|
| Gemini App | AI Plus, Pro, Ultra subscribers globally | Compute-based weekly limits (new model) |
| Google Flow | AI Plus / Pro / Ultra | 200 / 1,000 / 10,000–25,000 Flow credits per month |
| YouTube Shorts & Create App | Free users | Rolling out this week |
Gemini App (paid tiers only)
Free users don’t get the model in the Gemini app. The free entry point is YouTube. Paid tiers start at AI Plus ($7.99/month).
Google Flow (Pro/Ultra credit allocations)
Flow is where the real workflow surfaces live — multi-clip composition, ingredient libraries, custom voices, edit-on-existing-video. The Google Flow support documentation lists features exclusive to this model: 10-second clips (vs 4s/6s/8s on lower models), uploaded-video editing, custom voice creation. Per-action credit costs vary by clip length and edit type — I’ll cover credit economics in a separate piece. For this brief, 200 credits (Plus) is exploratory; serious iteration needs Pro or higher.
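Until per-action costs are pinned down, a rough way to size a tier is to divide the monthly allocation by an assumed cost per generation. The sketch below uses the allocations from the table above. `credits_per_clip` and the value 20 are placeholders I made up for illustration, not published Flow prices.

```python
# Monthly Flow credit allocations from the access table above.
# Ultra is listed as a range; the lower bound is used here.
MONTHLY_CREDITS = {"Plus": 200, "Pro": 1_000, "Ultra": 10_000}


def clips_per_month(tier: str, credits_per_clip: int) -> int:
    """Return how many whole generations a tier's monthly credits cover.

    credits_per_clip is a hypothetical per-generation cost; substitute
    the real figure from Flow once you have it.
    """
    if credits_per_clip <= 0:
        raise ValueError("credits_per_clip must be positive")
    return MONTHLY_CREDITS[tier] // credits_per_clip


if __name__ == "__main__":
    assumed_cost = 20  # hypothetical placeholder, not a Google price
    for tier in MONTHLY_CREDITS:
        print(tier, clips_per_month(tier, assumed_cost))
```

At that assumed cost, Plus covers only a handful of generations, which is why I call it exploratory. The real number depends entirely on the actual per-action costs.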
YouTube Shorts and YouTube Create
The surprise distribution play. Free access to a frontier model — even constrained — is unusual. The strategic logic: OpenAI pulled Sora back to API-only earlier in 2026, leaving the consumer video space less crowded. Google is filling it with reach rather than peak quality.
What’s Not Yet Available
Developer API on Vertex AI (announced, not GA)
As of May 2026, the developer API is not generally available. Google’s blog says rollout to developers and enterprise customers via APIs will happen “in the coming weeks.” VentureBeat’s enterprise breakdown puts it directly: until the Vertex AI API is GA, Omni is effectively a consumer and prosumer tool. If you’re scoping an integration, treat the API as a Q3 2026 planning item, not a current option.
Longer-duration generation
10 seconds is the public ceiling. Google says longer durations are in the pipeline. No timeline.
Open editing of voice and likeness
You can use your own avatar. You cannot freely edit arbitrary voices or likenesses in uploaded videos. This is a deliberate safety boundary, not a capability gap.
A few other things circulating in launch coverage that Google has not officially confirmed: a 720p output cap, 60–90 second generation times, named avatar template packs. Treat those as unverified.
How It Sits in the Video Generation Landscape
Replacement of Veo in some product surfaces
Multiple outlets have reported that Omni Flash effectively replaces Veo in Flow and the Gemini app. Veo is not deprecated: Veo 3.1 still has API access, and for pure text-to-video at API-grade reliability, it’s the production option today. But within Google’s own consumer surfaces, Omni is reportedly the new default. The migration story Google is selling: ship with Veo now, plan the move when GA arrives.
Conversational editing vs prompt-only generation
This is the architectural bet. Most current video models — Veo included — treat each generation as a new pass. Omni’s edits are stateful. For workflows that involve iteration (most professional ones), that changes the math on credit-per-final-clip. Whether the math actually works depends on how well the model preserves intent across edits. I haven’t tested it long enough to say.
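To make “changes the math” concrete, here is a back-of-envelope model. It assumes a stateless workflow pays the full generation cost on every revision, while a stateful one pays it once and then a cheaper edit cost per revision. `drift_rate` is the assumed fraction of edits that lose intent and have to be retried. The function names and every number are hypothetical; Google has not published per-edit pricing that I can cite.

```python
def stateless_cost(revisions: int, generation_cost: float) -> float:
    """Cost when every revision is a fresh generation from a rewritten prompt."""
    return (1 + revisions) * generation_cost


def stateful_cost(
    revisions: int,
    generation_cost: float,
    edit_cost: float,
    drift_rate: float = 0.0,
) -> float:
    """Expected cost of one initial generation plus edits on the prior state.

    drift_rate is the assumed probability that an edit breaks intent and
    must be retried; expected attempts per successful edit is 1 / (1 - p).
    """
    if not 0.0 <= drift_rate < 1.0:
        raise ValueError("drift_rate must be in [0, 1)")
    expected_attempts = 1.0 / (1.0 - drift_rate)
    return generation_cost + revisions * edit_cost * expected_attempts


if __name__ == "__main__":
    # Hypothetical costs: 20 credits per generation, 5 per edit, 4 revisions.
    print(stateless_cost(4, 20))
    print(stateful_cost(4, 20, 5))
    print(stateful_cost(4, 20, 5, drift_rate=0.5))
```

The point of the `drift_rate` term: the stateful advantage shrinks as edits get less reliable, which is the part I can’t measure yet.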
What Builders and Product Teams Should Watch
API timing and pricing signals
The developer API is the gating factor for any production integration. Two things to monitor: the Gemini API documentation for the actual SKU appearing, and the Vertex AI pricing page for per-token or per-second billing structure. Token-based pricing — Google’s standard for the Gemini family — would make this easier to forecast than per-clip pricing.
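One low-effort way to watch for the SKU is to poll a model listing and look for a new name. The helper below only filters an already-fetched, already-parsed listing, so it runs offline. The payload shape (a `models` array of objects with a `name` field) is what I’d expect from the Gemini API’s model list response, but check it against the current docs. The `"omni"` substring is my guess at the naming, and the sample entries are invented.

```python
from typing import Any


def find_matching_models(listing: dict[str, Any], needle: str = "omni") -> list[str]:
    """Return sorted model names in a parsed listing that contain `needle`.

    The {"models": [{"name": ...}]} shape is an assumption; adjust the
    keys if the real response differs.
    """
    names = [entry.get("name", "") for entry in listing.get("models", [])]
    return sorted(name for name in names if needle.lower() in name.lower())


if __name__ == "__main__":
    # Invented sample payload; none of these names are confirmed SKUs.
    sample = {
        "models": [
            {"name": "models/example-text-model"},
            {"name": "models/example-omni-flash"},
        ]
    }
    print(find_matching_models(sample))
```

Run something like this on a daily schedule against the real listing and alert when the result goes from empty to non-empty.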
Likely arrival on aggregation platforms
Once the API lands, expect the model to show up on unified-access platforms within weeks. If you’re already integrated against a multi-model API layer, migration cost from Veo 3.1 should be small. If you’re directly integrated to a single provider, the case for adding an aggregation layer gets stronger every quarter — this launch is one more data point in that direction.
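If you want the cheap-migration property without adopting a third-party platform, a thin in-house interface gets you most of the way. This is a sketch of that design, not any vendor’s SDK: `VideoRequest`, `VideoBackend`, `StubBackend`, and `render` are all hypothetical names, and the backend is a stub that returns a label instead of calling a real API. The 8 and 10 second caps echo the clip lengths mentioned in the Flow section; mapping 8 seconds to Veo is my assumption.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class VideoRequest:
    prompt: str
    seconds: int


class VideoBackend(Protocol):
    """What application code depends on, instead of a specific provider."""

    name: str
    max_seconds: int

    def generate(self, request: VideoRequest) -> str: ...


class StubBackend:
    """Stand-in backend; a real one would wrap a provider's client."""

    def __init__(self, name: str, max_seconds: int) -> None:
        self.name = name
        self.max_seconds = max_seconds

    def generate(self, request: VideoRequest) -> str:
        # Returns a descriptive label rather than real video data.
        return f"{self.name}:{request.seconds}s:{request.prompt}"


def render(backend: VideoBackend, request: VideoRequest) -> str:
    """Validate the request against the backend's cap, then generate."""
    if request.seconds > backend.max_seconds:
        raise ValueError(f"{backend.name} caps clips at {backend.max_seconds}s")
    return backend.generate(request)


if __name__ == "__main__":
    veo = StubBackend("veo-stub", max_seconds=8)    # assumed cap
    omni = StubBackend("omni-stub", max_seconds=10)  # cap from this article
    request = VideoRequest(prompt="a marble bouncing", seconds=8)
    for backend in (veo, omni):
        print(render(backend, request))
```

Swapping models then means adding one backend class and changing which instance you pass to `render`; the calling code stays the same.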
FAQ
Is the Omni Flash API available for developers yet?
No. As of May 2026, the developer API is not generally available. Google says rollout via Gemini API and Vertex AI will happen “in the coming weeks.” Until then, there is no programmatic access.
What’s the maximum video length Omni Flash can generate?
10 seconds. Google DeepMind has stated this is a deployment decision rather than a model architectural limit. Longer durations are planned without a public timeline.
Does Omni Flash replace Google’s Veo model entirely?
No. Veo 3.1 remains available with API access for text-to-video workloads. Within Google’s own consumer surfaces (Gemini app, Flow), the new model is reportedly the default. For production API integrations today, Veo is the working option.
Can I use Omni Flash output commercially?
It depends on Google’s Generative AI Prohibited Use Policy and your subscription tier terms. Commercial use is generally permitted within paid tiers, but specific scenarios (likeness-bearing content, third-party IP, regulated industries) need verification against current Google policy. Don’t take a blanket yes from anyone.
Does Omni Flash watermark every generated video?
Yes. All outputs carry an imperceptible SynthID watermark, verifiable through the Gemini app, Chrome, and Google Search. There is no opt-out.
Is Omni Flash available outside Google’s own apps?
Not yet. Current access is limited to the Gemini app, Google Flow, YouTube Shorts, and the YouTube Create app. Once the developer API ships, expect availability through Vertex AI and likely on third-party aggregation platforms shortly after.
Bottom Line
For most product teams, the practical answer this week is: nothing changes yet. Keep shipping with Veo 3.1. The decision point is the API GA — when it lands, the conversational-editing primitive is worth a real evaluation, especially if your pipeline already pays the cost of multi-pass video generation.
For consumer experimentation, Gemini app and Flow are the entry points on paid tiers; YouTube Shorts is the free path. Worth half an hour of hands-on time to calibrate your own quality expectations against the demos.
One disambiguation note: this is Google’s Gemini Omni Flash. There’s a separately named Qwen3.5-Omni-Flash from Alibaba — different vendor, different roadmap. Don’t conflate them.
That’s what I have today. I’ll revisit when the API ships.