Gemini 3.5 Pro Is Coming Next Month — What the Flash Release Already Tells Us

A day after the I/O 2026 keynote, the pre-keynote question of whether Google would ship “Gemini 3.5” or “Gemini 4.0” is settled: it’s 3.5. The more telling part is what Google launched and what it held back. Gemini 3.5 Flash went to general availability on May 19; Gemini 3.5 Pro is “coming next month.” Sundar Pichai’s exact line on stage: “Give us until next month to get it to you.”

The audience reportedly groaned. That’s a reasonable response, but the gap says more than it first appears to. Flash already exceeds Gemini 3.1 Pro on the benchmarks that matter most to builders, and it regresses on a specific set of reasoning and long-context benchmarks. Pro shipping a month later is most plausibly Google’s answer to that regression. Here’s what the Flash launch suggests about what Pro will be.

Confirmed: what Google said about 3.5 Pro

Google’s on-stage statements about Pro were minimal. The full set of confirmed facts:

Detail	Source	Status
Launches “next month” (June 2026)	Pichai keynote	Confirmed
Currently in internal testing	Pichai keynote	Confirmed
Will share Flash’s coding/agentic focus	I/O messaging	Confirmed
Specific benchmark numbers	—	Not disclosed
Pricing	—	Not disclosed
Context window	—	Not disclosed
Model ID	—	Not disclosed

That’s it. No benchmarks, no pricing, no model card. What’s confirmed about Pro amounts to a statement of intent, a testing status, and a timeline.

What the Flash data tells us about Pro

This is where it gets useful. Gemini 3.5 Flash launched the same day with full benchmarks, and the comparison against the previous-generation Gemini 3.1 Pro reveals exactly where the new generation is strong and where it’s weak.

Where Flash beats Gemini 3.1 Pro

Benchmark	3.5 Flash	3.1 Pro	Delta
Terminal-Bench 2.1	76.2%	70.3%	+5.9
MCP Atlas	83.6%	78.2%	+5.4
Finance Agent v2	57.9%	43.0%	+14.9
GDPval-AA	1656 Elo	1314 Elo	+342

These are all coding and agentic benchmarks, the categories where Claude has been the developer default. Flash beats the previous Pro tier by roughly 5 to 15 points on the percentage benchmarks and by 342 Elo on GDPval-AA, so it is now closer to Claude on these than 3.1 Pro was. That’s a meaningful product change, not a marginal one.

Where Flash regresses vs Gemini 3.1 Pro

Benchmark	3.5 Flash	3.1 Pro	Delta
Humanity’s Last Exam	40.2%	44.4%	−4.2
ARC-AGI-2	72.1%	77.1%	−5.0
Long-context (128K)	77.3%	84.9%	−7.6

These three are the exact benchmarks where you’d expect a Pro tier to differentiate: hard reasoning, abstract pattern matching, and long-context retrieval. The first two stress depth; the third stresses recall at scale. Flash dropping roughly 4 to 8 points on each suggests the Flash design traded some depth and recall for its speed and cost numbers.
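The deltas in the two tables above are easy to recompute and rank. The sketch below copies the published scores into a list and flags the benchmarks where Flash trails 3.1 Pro; the function names are illustrative, and nothing here goes beyond subtraction on the numbers already quoted.

```python
# Scores copied from the two tables above: (benchmark, 3.5 Flash, 3.1 Pro).
# GDPval-AA is an Elo rating; every other row is a percentage.
SCORES = [
    ("Terminal-Bench 2.1", 76.2, 70.3),
    ("MCP Atlas", 83.6, 78.2),
    ("Finance Agent v2", 57.9, 43.0),
    ("GDPval-AA", 1656.0, 1314.0),
    ("Humanity's Last Exam", 40.2, 44.4),
    ("ARC-AGI-2", 72.1, 77.1),
    ("Long-context (128K)", 77.3, 84.9),
]


def deltas(scores):
    """Return {benchmark: flash - pro}, rounded to one decimal place."""
    return {name: round(flash - pro, 1) for name, flash, pro in scores}


def regressions(scores):
    """Return the benchmarks where Flash scores below 3.1 Pro, worst first."""
    d = deltas(scores)
    losing = [name for name, delta in d.items() if delta < 0]
    return sorted(losing, key=lambda name: d[name])


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        print(f"{name:24s} {delta:+.1f}")
    print("Regressions:", regressions(SCORES))
```

If your workload resembles any benchmark in the regression list, that is the row to watch on the June model card.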

The 3.5 Pro launch in June is most plausibly Google’s answer to this exact list. Pro’s reason to exist is to restore the reasoning and long-context lead that Flash gave up. If Pro lands above 3.1 Pro on Humanity’s Last Exam and matches Flash on Terminal-Bench, it would have a strong claim to being the strongest production frontier model. If it fixes the regression only at the cost of agentic speed, it becomes a specialist reasoning tier rather than a general upgrade.

What Flash pricing implies for Pro

Flash launched at $1.50 input / $9.00 output per 1M tokens on the standard tier, 40% cheaper than Gemini 3.1 Pro (roughly $2.50 / $15.00) on both axes. Cached input is $0.15/1M, a tenth of the standard input rate, which is the headline number for retrieval-heavy workloads.
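To see what those rates mean per request, here is a minimal cost sketch. The Flash prices are the ones quoted above; the 3.1 Pro figure is the approximate $2.50 / $15 this article uses for comparison, not an official price sheet. It assumes cached tokens are billed at the cached rate in place of the standard input rate and ignores any cache-storage fees, which is a simplification. The class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens."""
    input: float
    output: float
    cached_input: Optional[float] = None


FLASH_35 = Price(input=1.50, output=9.00, cached_input=0.15)  # quoted above
PRO_31_APPROX = Price(input=2.50, output=15.00)               # approximate


def request_cost(price, input_tokens, output_tokens, cached_tokens=0):
    """Cost in USD of one request; cached_tokens is the cached share of the input."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    if cached_tokens and price.cached_input is None:
        raise ValueError("no cached-input price known for this model")
    fresh = input_tokens - cached_tokens
    total = fresh * price.input + output_tokens * price.output
    if cached_tokens:
        total += cached_tokens * price.cached_input
    return total / 1_000_000


def discount(cheaper, baseline):
    """Fractional saving of `cheaper` relative to `baseline`."""
    return 1 - cheaper / baseline


if __name__ == "__main__":
    print(f"input discount:  {discount(FLASH_35.input, PRO_31_APPROX.input):.0%}")
    print(f"output discount: {discount(FLASH_35.output, PRO_31_APPROX.output):.0%}")
    # A retrieval-style call: 100K input tokens, 2K output tokens.
    print(f"uncached:   ${request_cost(FLASH_35, 100_000, 2_000):.4f}")
    print(f"90% cached: ${request_cost(FLASH_35, 100_000, 2_000, cached_tokens=90_000):.4f}")
```

Under these assumptions, a 100K-token prompt with 90% of it cached costs well under a third of the uncached call, which is why the cached rate matters more than the headline input rate for RAG-style traffic.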

The straightforward read on Pro pricing:

If Pro launches at Gemini 3.1 Pro pricing or above (~$2.50/$15/1M or higher), it’s signaling that Pro is meant as a premium reasoning tier rather than a Flash replacement.
If Pro launches below 3.1 Pro pricing but above Flash, it’s positioned as the default “smarter Flash” — same product surface, higher capability, modest premium.
If Pro matches Flash pricing, that would be unusual and would put Flash in the same awkward position Seedance 2.0 Fast is currently in (see our Seedance 2.1 / Mini preview for the analogous tier-collision problem).

The first option is the most likely. Google is making a structural bet that customers will pay for the reasoning-tier separation. The audience groan suggests the market thinks Flash is good enough and Pro is unnecessary; we won’t know if the market is right until builders run their own evals against the June model card.
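The three pricing scenarios above reduce to a simple decision rule, sketched here. The Flash price is the one quoted earlier and the 3.1 Pro price is this article’s approximate figure; the function name and the handling of edge cases (a price below Flash, or one that crosses tiers on only one axis) are assumptions made for illustration.

```python
# Prices are (input, output) in USD per 1M tokens.
FLASH_35 = (1.50, 9.00)        # quoted earlier in this article
PRO_31_APPROX = (2.50, 15.00)  # approximate figure used in this article


def pro_positioning(pro_input, pro_output):
    """Map a hypothetical Pro price onto the three scenarios listed above."""
    if (pro_input, pro_output) == FLASH_35:
        return "tier collision with Flash"
    at_or_above_old_pro = (pro_input >= PRO_31_APPROX[0]
                           and pro_output >= PRO_31_APPROX[1])
    if at_or_above_old_pro:
        return "premium reasoning tier"
    between = (FLASH_35[0] <= pro_input <= PRO_31_APPROX[0]
               and FLASH_35[1] <= pro_output <= PRO_31_APPROX[1])
    if between:
        return "smarter Flash at a modest premium"
    # Anything else (cheaper than Flash, or mixed across axes) is not
    # covered by the three scenarios in the text.
    return "outside the three scenarios"
```

For example, `pro_positioning(2.00, 12.00)` falls into the middle scenario, while `pro_positioning(2.50, 15.00)` reads as the premium tier.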

Other things to watch in June

When the Pro model card drops, four specifics matter:

Does Pro match Flash on coding (Terminal-Bench, MCP Atlas)? If yes, Pro is a strict superset. If no, you’ll be running two endpoints — Flash for agents, Pro for reasoning — and the integration cost goes up.
Long-context numbers. If Pro restores the Gemini 3.1 Pro lead at 128K and extends to the same 1M-token context window Flash ships with, that’s the most production-relevant signal. RAG-heavy workloads should plan their migration on this number specifically.
Multimodal claims. Flash launched with the same image/video understanding as the 3.0 line. If Pro ships with the Gemini Omni video-generation integration (still rumored as of May 20), that would give Google a unification story it hasn’t been able to tell so far.
Whether Pro is a thinking model. Google’s recent reasoning models have shipped with optional “thinking” modes that trade latency for accuracy. If 3.5 Pro defaults to thinking-on or exposes per-request control, that materially affects how you’d use it in production.
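If Pro does not match Flash on coding, the two-endpoint setup described in the first item needs a routing rule. The sketch below is one way to express it. Only `gemini-3.5-flash` comes from this article; the Pro model ID is a placeholder because Google has not disclosed it, and the task kinds and the 100,000-token threshold are arbitrary assumptions you would tune against your own evals.

```python
from dataclasses import dataclass

FLASH_MODEL = "gemini-3.5-flash"   # model ID given in this article
PRO_MODEL = "PRO_MODEL_ID_TBD"     # placeholder: the Pro model ID is not disclosed


@dataclass(frozen=True)
class Task:
    kind: str            # e.g. "agentic", "coding", "reasoning", "retrieval"
    prompt_tokens: int   # estimated size of the prompt


def pick_model(task, long_context_threshold=100_000):
    """Send hard-reasoning and long-context work to Pro, everything else to Flash.

    The threshold is a hypothetical default, not a documented limit.
    """
    if task.kind == "reasoning":
        return PRO_MODEL
    if task.prompt_tokens >= long_context_threshold:
        return PRO_MODEL
    return FLASH_MODEL
```

Keeping the rule in one function makes it cheap to collapse back to a single endpoint if the model card shows Pro matching Flash on Terminal-Bench and MCP Atlas.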

What to do this month

While Pro is in internal testing, three concrete moves:

Run your evals against 3.5 Flash this week. It’s live on the Gemini API, Google AI Studio, Vertex, Antigravity, and the Gemini app under model ID gemini-3.5-flash. If Flash already covers your workload, you may not need Pro at all.
For long-context or hard-reasoning workloads, stay on Gemini 3.1 Pro for now. Don’t migrate downward to Flash just because it’s the newest model — the 7.6-point regression at 128K is real. Wait for Pro.
Set up your June A/B test now. Define the Flash → Pro comparison eval before Pro lands. The temptation to switch on launch day is real; a held-out benchmark you’ve already run against Flash and 3.1 Pro is worth more.
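A held-out eval of the kind described in the last item can be very small. The sketch below treats each model as a plain function from prompt to completion, so it runs offline with stubs and can later wrap whichever API client you use. The exact-match scoring, the stub answers, and the four sample cases are all illustrative assumptions; a real eval set would come from your own workload.

```python
from typing import Callable, Dict, List, Tuple

EvalCase = Tuple[str, str]      # (prompt, expected answer)
Model = Callable[[str], str]    # prompt in, completion out


def accuracy(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of cases answered correctly under case-insensitive exact match."""
    if not cases:
        raise ValueError("eval set is empty")
    hits = sum(
        model(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in cases
    )
    return hits / len(cases)


def compare(models: Dict[str, Model], cases: List[EvalCase]) -> Dict[str, float]:
    """Score every model on the same held-out cases."""
    return {name: accuracy(model, cases) for name, model in models.items()}


# Stub models so the sketch runs without network access; replace each with a
# function that calls the real endpoint and returns the completion text.
def stub_flash(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Paris"}.get(prompt, "")


def stub_old_pro(prompt: str) -> str:
    answers = {"2+2?": "4", "Capital of France?": "paris",
               "Largest prime below 20?": "19"}
    return answers.get(prompt, "")


HELD_OUT: List[EvalCase] = [
    ("2+2?", "4"),
    ("Capital of France?", "Paris"),
    ("Largest prime below 20?", "19"),
    ("Opposite of hot?", "cold"),
]

if __name__ == "__main__":
    print(compare({"3.5 Flash": stub_flash, "3.1 Pro": stub_old_pro}, HELD_OUT))
```

When Pro ships, adding it is one more entry in the `models` dictionary, scored on the same frozen cases as the two baselines.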

Until Pro ships

For LLM-side workloads, the WaveSpeedAI LLM endpoint gives you OpenAI-compatible access to the current frontier text models behind a single API key. When Gemini 3.5 Pro lands in June, expect to compare it under that same endpoint within days — alongside Flash and the rest of the frontier text lineup.

Sources: MacRumors I/O 2026 roundup, LLM Stats on Gemini 3.5 Flash, Felloai Gemini 3.5 review, BusinessToday on Gemini Spark and 3.5.