

Each request costs $0.012.
That works out to roughly 83 runs per $1.





Z-Image Turbo ControlNet is a powerful image generation model that gives you precise control over composition through structural guidance signals. Unlike standard text-to-image models that interpret prompts freely, ControlNet lets you define the exact structure, edges, depth, or pose of your output by analyzing a reference image.
Think of it as a blueprint system: you provide a reference image, choose how to analyze it (depth map, edge detection, or pose estimation), and the model generates a new image that follows that structural blueprint while matching your text prompt.
- Precise composition control: Define exact layouts, poses, and spatial relationships instead of hoping the model interprets your prompt correctly.
- Multiple control modes: Choose depth mapping for 3D structure, canny edge detection for outlines, pose estimation for human figures, or none for standard generation.
- Reference-guided generation: Use existing images as structural templates while completely changing style, content, and appearance.
- Flexible strength control: Adjust how strictly the model follows the control signal, from loose inspiration to exact replication.
- Fast and affordable: Turbo-optimized for quick generation at just $0.012 per image.
The mode parameter determines how the model analyzes your reference image:
| Mode | What It Extracts | Best For |
|---|---|---|
| depth | 3D depth information (near/far relationships) | Architectural scenes, landscapes, maintaining spatial depth |
| canny | Edge outlines and contours | Line art, sketches, preserving shapes and boundaries |
| pose | Human body keypoints and skeleton | Character poses, figure drawing, action scenes |
| none | No control signal (standard generation) | When you don't need structural guidance |
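The mode table can be read as a simple lookup from the kind of reference image to a control mode. The sketch below is only an illustration: `suggest_mode` and the subject labels are hypothetical names, not part of the model's API. Only the four mode values come from the table, and falling back to `depth` for unknown subjects is an assumption based on the documented default.

```python
# Hypothetical helper: maps a rough description of the reference image
# to one of the four documented mode values (depth, canny, pose, none).
MODE_GUIDE = {
    "architecture": "depth",
    "landscape": "depth",
    "line_art": "canny",
    "sketch": "canny",
    "character_pose": "pose",
    "action_scene": "pose",
    "no_guidance": "none",
}


def suggest_mode(subject: str) -> str:
    """Return a suggested control mode for a subject label.

    Unknown labels fall back to "depth" (assumed here because it is
    the documented default for the mode parameter).
    """
    return MODE_GUIDE.get(subject, "depth")


print(suggest_mode("sketch"))
print(suggest_mode("architecture"))
```

The subject labels are arbitrary; in practice the choice depends on which structural property of the reference image you want to preserve.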

The model accepts the following parameters:
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the image you want to generate |
| image | Yes | Reference image URL for ControlNet to analyze |
| mode | No | Control mode: depth, canny, pose, or none (default: depth) |
| size | No | Output size in pixels as `width*height` (default: `1024*1024`) |
| strength | No | Control signal strength 0-1 (default: 0.6) |
| seed | No | Random seed for reproducibility (-1 for random) |
| output_format | No | Output format: jpeg, png, or webp (default: jpeg) |
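As a rough illustration of how these parameters fit together, the following sketch assembles and validates a request body using the documented names and defaults. `build_payload` is a hypothetical helper, the image URL is a placeholder, and the `*` separator in the `size` string is an assumption. The endpoint and authentication are deliberately left out because this page does not specify them.

```python
import json

VALID_MODES = {"depth", "canny", "pose", "none"}
VALID_FORMATS = {"jpeg", "png", "webp"}


def build_payload(prompt, image, mode="depth", size="1024*1024",
                  strength=0.6, seed=-1, output_format="jpeg"):
    """Assemble a request body from the documented parameters.

    Hypothetical helper. The "width*height" form of `size` is an
    assumption about the expected format.
    """
    if not prompt:
        raise ValueError("prompt is required")
    if not image:
        raise ValueError("image (reference image URL) is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if not 0 <= strength <= 1:
        raise ValueError("strength must be between 0 and 1")
    if output_format not in VALID_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(VALID_FORMATS)}")
    return {
        "prompt": prompt,
        "image": image,
        "mode": mode,
        "size": size,
        "strength": strength,
        "seed": seed,  # -1 requests a random seed
        "output_format": output_format,
    }


payload = build_payload(
    prompt="a watercolor painting of a lighthouse at dusk",
    image="https://example.com/reference.jpg",  # placeholder URL
    mode="canny",
    strength=0.8,
)
print(json.dumps(payload, indent=2))
```

Validating locally before sending a request catches out-of-range values such as a strength above 1 without spending a paid call.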
$0.012 per image. Simple flat-rate pricing regardless of control mode or image size.
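Because the rate is flat, budgeting is plain arithmetic. The sketch below assumes the $0.012 per-image price stated above; the function names are illustrative. It uses `Decimal` to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.012")  # flat rate stated on this page


def images_for_budget(budget_usd: str) -> int:
    """Number of whole images a budget covers at the flat rate."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


def cost_for_images(count: int) -> Decimal:
    """Total cost in USD for a given number of images."""
    return PRICE_PER_IMAGE * count


print(images_for_budget("1"))   # 83
print(cost_for_images(100))     # 1.200
```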