text-to-image

google/imagen3

Google's highest-quality text-to-image model, capable of generating images with fine detail, rich lighting, and beauty.

NEW
COMMERCIAL USE
PARTNER
If enabled, the output is returned as a Base64-encoded string instead of a URL. This property is only available through the API.
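A minimal sketch of how a client might turn such a Base64 string back into image bytes. The helper name `decode_image_payload` and the optional `data:` URI prefix handling are assumptions for illustration; the page does not specify the exact response shape.

```python
import base64
import binascii


def decode_image_payload(payload: str) -> bytes:
    """Decode a Base64 image string into raw bytes.

    Accepts bare Base64 or a data URI ("data:image/png;base64,...").
    Handling the data-URI form is an assumption, not documented behavior.
    """
    payload = payload.strip()
    if payload.startswith("data:"):
        # Keep only the encoded body that follows the first comma.
        _, _, payload = payload.partition(",")
    try:
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid Base64") from exc
```

The returned bytes can then be written to disk with `open(path, "wb")` or handed to an image library.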

Your request will cost $0.04 per run.

For $1 you can run this model approximately 25 times.
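The arithmetic behind these figures can be sketched as follows. The $0.04 price comes from the note above; the function names are hypothetical, and `Decimal` is used only to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

# Price per run in USD, taken from the pricing note above.
PRICE_PER_RUN = Decimal("0.04")


def runs_for_budget(budget_usd: Decimal,
                    price_per_run: Decimal = PRICE_PER_RUN) -> int:
    """Return the number of whole runs a budget covers."""
    return int(budget_usd // price_per_run)


def cost_of_runs(runs: int,
                 price_per_run: Decimal = PRICE_PER_RUN) -> Decimal:
    """Return the total cost in USD of a given number of runs."""
    return price_per_run * runs


print(runs_for_budget(Decimal("1")))  # 25
print(cost_of_runs(25))               # 1.00
```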

Examples

A group of friends eating hotpot at home, smiling, chatting, with realistic tabletop details
A woman drinking coffee alone in a corner café, reading a book, afternoon light through large windows
A grandmother knitting on an armchair, worn books stacked beside her, sleepy dog at her feet, old wooden clock on the wall, peaceful ambiance
Two roommates sitting on a cozy couch, laughing while watching a movie, popcorn bowl between them, string lights in the background
A man brewing coffee in a small kitchen, sunlight filtering through patterned curtains, steam rising, tiled countertop, houseplants on the windowsill
A young woman tying her hair in front of a foggy bathroom mirror, toothbrush in mouth, cozy pajamas, bathroom shelf cluttered with daily items

README

Imagen 3

Imagen 3 is Google DeepMind’s latest text-to-image generative model, focused on high-quality image generation with improved detail and lighting and fewer artifacts.

Core Capabilities

  • Enhanced prompt understanding for complex image generation tasks

  • Improved text rendering for applications like presentations and typography

  • Support for diverse artistic styles from photorealism to animation

  • Better handling of lighting, textures, and fine details

  • Natural language prompt processing without requiring complex prompt engineering

Technical Improvements

Image Quality

  • Enhanced color balance and vibrancy

  • Improved texture rendering

  • Better detail preservation in complex scenes

  • Reduced artifact generation

  • More accurate style reproduction across different artistic genres

Prompt Processing

  • Support for longer, more detailed prompts

  • Better understanding of camera angles and composition requirements

  • Improved handling of specific style requests

  • Enhanced text rendering capabilities
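Since the model takes plain natural-language prompts, one simple way to keep longer prompts organized is to assemble them from a subject, scene details, a camera or composition note, and a style request. The sketch below is only a convenience helper; `build_prompt` and its parameter names are hypothetical and are not part of any Imagen 3 API.

```python
from typing import Iterable, Optional


def build_prompt(subject: str,
                 details: Iterable[str] = (),
                 camera: Optional[str] = None,
                 style: Optional[str] = None) -> str:
    """Join prompt fragments into one comma-separated description."""
    parts = [subject.strip()]
    # Skip empty fragments so no stray commas appear in the result.
    parts.extend(d.strip() for d in details if d.strip())
    if camera:
        parts.append(camera.strip())
    if style:
        parts.append(f"{style.strip()} style")
    return ", ".join(parts)


prompt = build_prompt(
    "A man brewing coffee in a small kitchen",
    details=["steam rising", "houseplants on the windowsill"],
    camera="low-angle shot",
    style="photorealistic",
)
print(prompt)
```

The resulting string reads like the example prompts listed earlier on this page and can be sent to the model unchanged.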

Benchmarks

Performance metrics based on human evaluation using GenAI-Bench:

  • Highest score for visual quality among compared models

  • High accuracy in adhering to the prompt

  • Strong performance in overall preference benchmarks

Detailed benchmark methodology and results are available in Appendix D of the technical report.

Security Features

  • Built-in content filtering system

  • Dataset filtering to minimize harmful content

  • SynthID watermarking integration for image identification

  • Extensive red teaming and evaluations covering fairness, bias, and content safety

Technical Documentation

For detailed technical specifications and methodology, refer to the full technical report.