WaveSpeed.ai
Flux Kontext Max

wavespeed-ai/flux-kontext-max/text-to-image

FLUX.1 Kontext [max] is a text-to-image model with maximum performance and greatly improved prompt adherence for accurate results. It is served through a ready-to-use REST inference API with no cold starts and affordable pricing.

Input
enable_sync_mode: If set to true, the request waits for the result to be generated and uploaded before returning, so you get the result directly in the response. This property is only available through the API.

Your request will cost $0.08 per run.

For $1 you can run this model approximately 12 times.

Examples

The camera begins in a dimly lit interrogation room, focusing closely on the eyes of a white woman in her early 30s. It slowly pulls back to reveal her full face—platinum blonde hair slightly disheveled, her expression calm but betraying flickers of anxiety. She sits upright in a metal chair, tense wrists resting on the cold tabletop. A harsh overhead light casts sharp shadows across her face, creating stark contrast and strong shadow lines.
Anime racer girl on flaming hoverboard, graffiti skyline, dynamic angle
Van Gogh-style starry night ocean, thick brushstrokes, swirling waves
Vintage bicycle leaning against brick wall with flower basket, cobblestone street, dusk blue hour ambiance
A cute lego doll with a garden background
A vibrant young superhero in a brightly colored costume, striking a dynamic pose atop a city skyline, with a lively urban landscape in the background. Bold lines, vivid colors, American comic book/Japanese anime style.
An elderly woman in elaborate Victorian attire, seated in a velvet-upholstered armchair, holding an open book with a serene expression, against the backdrop of a dimly lit classical study. Classical oil painting texture, soft Rembrandt lighting, rich and deep colors.
An 8-bit style adventurer character, holding a pixelated sword and shield, standing in a simple forest scene with blocky trees and bushes in the background. Clear pixel blocks, limited color palette, nostalgic retro game style.
A human silhouette with its head replaced by a floating geometric shape, situated in a distorted, dreamlike landscape, with broken clocks and floating islands in the sky. Non-linear perspective, bizarre juxtapositions, unexpected elements, surrealist art.
A mechanic with goggles and gear embellishments, wearing a leather apron, repairing intricate machinery in a workshop filled with steam and contraptions, with gleaming metal and pipes in the background. Retro-futurism, metallic sheen, intricately detailed mechanical parts, industrial steampunk aesthetic.

README

FLUX Kontext Max (Text-to-Image) — wavespeed-ai/flux-kontext-max/text-to-image

FLUX Kontext Max is a high-quality text-to-image model built for prompt-faithful generation with strong composition control and cinematic detail. Write a clear scene description (subject, setting, action, camera, lighting, mood), and it produces polished, production-ready images—great for story frames, concept art, marketing visuals, and realistic or stylized illustration work.

Key capabilities

  • High-fidelity text-to-image generation with strong prompt adherence
  • Cinematic composition control (framing, lens feel, lighting, atmosphere)
  • Handles complex scenes (multiple elements, detailed environments, coherent styling)
  • Consistent output quality suitable for professional creative workflows

Pricing

$0.08 per image.

Total cost = num_images × $0.08
Example: num_images = 4 → $0.32
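
The pricing rule can be checked with a few lines of Python. This is a minimal sketch: the function names `total_cost` and `images_for_budget` are illustrative and not part of any WaveSpeed SDK, and `Decimal` is used only to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

# Price taken from the pricing line above.
PRICE_PER_IMAGE = Decimal("0.08")


def total_cost(num_images: int) -> Decimal:
    """Return the total cost in USD for a batch of images."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return PRICE_PER_IMAGE * num_images


def images_for_budget(budget_usd: str) -> int:
    """Return how many whole images a budget in USD covers."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


print(total_cost(4))           # 0.32
print(images_for_budget("1"))  # 12
```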

How to use

  1. Write a prompt describing the scene and desired look.
  2. Choose an aspect ratio for your target layout (e.g., 16:9 for cinematic, 1:1 for avatars).
  3. Adjust guidance_scale if you need stricter prompt following.
  4. Set a seed for repeatable results (optional), then generate.

Parameters

  • prompt (required): The text description of what to generate
  • seed: Fixed value for reproducibility; leave it empty or use a random value for variation
  • guidance_scale: Higher values follow the prompt more strongly (too high may look rigid or overcooked)
  • aspect_ratio: Output aspect ratio (e.g., 16:9, 1:1, 9:16)
  • enable_sync_mode: Wait for generation and return results directly (API only)
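
As a rough illustration of how these parameters might be assembled into an API call, the sketch below builds a JSON payload and a POST request without sending it. The base URL, the path layout, and the Bearer-token header are assumptions that this page does not confirm, so check the official API reference before relying on them. `build_payload` and `build_request` are hypothetical helper names.

```python
import json
import urllib.request

# ASSUMPTION: endpoint layout and auth scheme are not stated on this page.
API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_ID = "wavespeed-ai/flux-kontext-max/text-to-image"


def build_payload(prompt, seed=None, guidance_scale=None,
                  aspect_ratio=None, enable_sync_mode=False):
    """Collect the documented parameters into a request body."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    payload = {"prompt": prompt.strip(), "enable_sync_mode": enable_sync_mode}
    # Send optional fields only when set, so server-side defaults apply otherwise.
    if seed is not None:
        payload["seed"] = seed
    if guidance_scale is not None:
        payload["guidance_scale"] = guidance_scale
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return payload


def build_request(payload, api_key):
    """Build (but do not send) a POST request for the model endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/{MODEL_ID}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


payload = build_payload(
    "A cute lego doll with a garden background",
    seed=7,
    aspect_ratio="1:1",
)
request = build_request(payload, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The response format is not described here, so parsing it is left out.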

Prompting guide

A reliable prompt structure:

  • Subject: who/what is in the image
  • Action: what is happening
  • Scene: where + time + atmosphere
  • Camera: framing, lens vibe, movement (if relevant)
  • Lighting: key light, contrast, color temperature
  • Style: realistic, film still, illustration, etc.

Example pattern: A [shot type] of [subject] [action] in [scene]. [Lighting + mood]. [Camera/framing]. [Style cues].
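
The pattern above can be turned into a small template helper so that every prompt keeps the same order of elements. This is only a sketch: `build_prompt` and its argument names are illustrative, and the optional parts are skipped when left empty.

```python
def build_prompt(shot_type, subject, action, scene,
                 lighting="", camera="", style=""):
    """Fill the pattern: A [shot type] of [subject] [action] in [scene]. ..."""
    first = f"A {shot_type} of {subject} {action} in {scene}."
    # Each optional part becomes its own sentence with exactly one trailing period.
    extras = [
        part.strip().rstrip(".") + "."
        for part in (lighting, camera, style)
        if part.strip()
    ]
    return " ".join([first] + extras)


print(build_prompt(
    shot_type="cinematic close-up",
    subject="a detective",
    action="waiting",
    scene="a rain-streaked office",
    lighting="Flickering fluorescent light, tense mood",
    camera="Shallow depth of field",
    style="Film still look",
))
```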

Example prompts

  • A cinematic close-up of a detective under flickering fluorescent lights, rain streaks on the window behind, shallow depth of field, tense atmosphere, film still look.
  • Wide shot of a futuristic street market at dusk, neon reflections on wet pavement, dense crowd silhouettes, volumetric light beams, high detail, cinematic composition.
  • Studio product photo of a minimalist smartwatch on a glossy black surface, softbox lighting, crisp reflections, premium advertising style, ultra clean background.

Best practices

  • Keep the first sentence simple and concrete, then add constraints and style cues.
  • Use camera and lighting language to steer composition more reliably than abstract adjectives.
  • For consistent iteration, fix seed and adjust one variable at a time (prompt wording or guidance_scale).
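
The last practice, fixing the seed and changing one variable at a time, could be scripted as a small sweep. The sketch below only builds the request bodies. The `guidance_scale` values and the seed are arbitrary illustrations, not recommended settings, and `guidance_sweep` is a hypothetical helper name.

```python
def guidance_sweep(prompt, seed, scales, aspect_ratio="16:9"):
    """Return one payload per guidance_scale value, holding everything else fixed."""
    base = {"prompt": prompt, "seed": seed, "aspect_ratio": aspect_ratio}
    return [{**base, "guidance_scale": scale} for scale in scales]


runs = guidance_sweep(
    "Wide shot of a futuristic street market at dusk",
    seed=42,
    scales=[2.5, 3.5, 4.5],  # illustrative values only
)
for run in runs:
    print(run["seed"], run["guidance_scale"])
```

Because only `guidance_scale` differs between runs, any change in the output can be attributed to that one parameter.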