xAI Grok 2 Image
Grok 2 Image is xAI’s latest image generation model that turns simple text prompts into sharp, photorealistic visuals in seconds. From product shots to social posts and concept art, it follows your instructions closely so you can go from idea to production-ready image with just one prompt. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
Features
Grok 2 Image
What is Grok 2 Image?
Grok 2 Image turns a natural-language text prompt into vivid, realistic images.
It’s xAI’s flagship image generation model, tuned for marketing creatives, social posts, product visuals, concept art, and more.
In the API, you use the grok-2-image model. A single request can generate multiple images, making it easy to explore variations on a single idea.
Why it looks great
- Photorealistic, high-fidelity imagery: Trained to produce detailed textures, convincing lighting, and sharp compositions that work well for ads, hero images, and product renders.
- Strong prompt following: Optimized for following descriptive prompts closely, capturing objects, layouts, and styles specified in your text while minimizing “prompt drift.”
- Flexible visual styles: Handles realistic photography, digital illustration, stylized artwork, and concept sketches, making it useful for storyboards, thumbnails, and creative exploration.
- Multi-image generation in one shot: A single request can generate up to 10 JPG images, so you can explore multiple creative directions from one prompt.
- Competitive per-image pricing: Images are billed per output image, keeping costs predictable for batch runs and A/B creative testing.
- Prompt refinement under the hood: Before reaching the image model, your text prompt can be lightly revised by a chat model to improve clarity, often leading to more accurate results without extra work on your side.
Pricing
- Billing is based on the number of images generated.
- Each image will cost $0.07.
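As a rough illustration of the per-image billing above, the sketch below estimates the charge for a batch. It assumes every generated image is billed at the flat $0.07 rate and that no other fees apply; the function name `estimate_cost_usd` is hypothetical and not part of any SDK.

```python
from decimal import Decimal

# Per-image price quoted in the Pricing section.
PRICE_PER_IMAGE_USD = Decimal("0.07")


def estimate_cost_usd(total_images: int) -> Decimal:
    """Return the estimated charge for a number of generated images."""
    if total_images < 0:
        raise ValueError("total_images must be non-negative")
    # Decimal avoids the rounding noise of binary floats for currency.
    return PRICE_PER_IMAGE_USD * total_images


print(estimate_cost_usd(10))   # one request at the 10-image maximum
print(estimate_cost_usd(250))  # e.g. 25 requests of 10 images each
```

Because a request is capped at 10 images, larger batches are simply several requests; the estimate depends only on the total image count.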
How to Use
- Write your prompt
  - Describe the subject, scene, style, and mood, for example:
    - “ultra-wide shot of a neon city at night, rainy streets, cinematic”
    - “product photo of wireless earbuds on a marble surface, soft studio lighting”
- Send the generation job
  - Call the image API with model: “grok-2-image” (or grok-2-image-1212) and your prompt.
  - Optionally specify how many variations to generate (up to 10 images per request).
- Download or display the results
  - The API returns JPG images via URLs or encoded data, which you can save, display in an app, or feed into downstream editing/compositing tools.
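The steps above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: it targets the WaveSpeedAI submit endpoint listed under API Endpoints on this page, and the helper names `build_payload` and `submit_task` are hypothetical. The API key is assumed to be supplied by the caller.

```python
import json
import urllib.request

# Submit endpoint as listed in the API Endpoints section of this page.
SUBMIT_URL = "https://api.wavespeed.ai/api/v3/x-ai/grok-2-image"


def build_payload(prompt: str, num_images: int = 1) -> dict:
    """Validate inputs and return the JSON body for a generation request."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if not 1 <= num_images <= 10:
        raise ValueError("num_images must be between 1 and 10")
    return {"prompt": prompt.strip(), "num_images": num_images}


def submit_task(payload: dict, api_key: str) -> dict:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Example usage (performs a real network call, so it is left commented out):
# body = build_payload("ultra-wide shot of a neon city at night, rainy streets, cinematic", 4)
# task = submit_task(body, api_key="YOUR_API_KEY")
# print(task["data"]["id"])
```

Validating `num_images` locally catches out-of-range values before a request is sent.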
Note
- Output format: Images are returned in JPG format.
- Per-job limits:
  - Up to 10 images per request
  - Additional throughput limits depend on your account/plan.
- Prompt tips:
  - Be concrete about objects, layout, and style (e.g., “centered product on plain background”).
  - Avoid contradictory instructions in a single prompt.
  - Iterate: start simple, then gradually add details once you like the base composition.
More Image Generation Model Choices
- Nano Banana Pro: High-quality text-to-image generation from Google, suitable for product shots, concept art, and creative visuals.
- Seedream v4.5: A versatile image generation model from ByteDance, tuned for detailed scenes, characters, and stylized compositions.
- Kling Image O1: A flagship image model from Kwaivgi/Kuaishou’s Kling series, focused on sharp, high-fidelity visuals and strong prompt adherence.
- Qwen Image: An Alibaba Qwen-based generator hosted by WaveSpeedAI, delivering robust semantic understanding and reliable text-to-image rendering across diverse styles.
Reference
- xAI Image Generations Guide – official documentation for using grok-2-image via the xAI API.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/x-ai/grok-2-image" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"prompt": "ultra-wide shot of a neon city at night, rainy streets, cinematic",
"num_images": 1,
"enable_sync_mode": false,
"enable_base64_output": false
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| num_images | integer | No | 1 | 1 ~ 10 | Number of images to be generated. |
| enable_sync_mode | boolean | No | false | - | If set to true, the request waits until the result has been generated and uploaded, so the result is returned directly in the response. This property is only available through the API. |
| enable_base64_output | boolean | No | false | - | If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API. |
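When `enable_base64_output` is true, each output arrives as an encoded string rather than a URL. The sketch below shows one way to turn such strings into JPG files. It rests on an assumption this page does not confirm: that each string is either plain base64 or a data URI of the form `data:image/jpeg;base64,<payload>`. The helper names are hypothetical.

```python
import base64


def decode_base64_output(item: str) -> bytes:
    """Decode one base64-encoded output into raw image bytes.

    Assumption: the string is plain base64 or a data URI; a data URI
    prefix, if present, is stripped before decoding.
    """
    if item.startswith("data:") and "," in item:
        item = item.split(",", 1)[1]
    return base64.b64decode(item)


def save_outputs(outputs: list[str], prefix: str = "grok2") -> list[str]:
    """Write each decoded output to '<prefix>_<index>.jpg' and return the paths."""
    paths = []
    for index, item in enumerate(outputs):
        path = f"{prefix}_{index}.jpg"
        with open(path, "wb") as handle:
            handle.write(decode_base64_output(item))
        paths.append(path)
    return paths
```

With the default URL output, this step is unnecessary; the URLs in `data.outputs` can be downloaded directly.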
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
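The response fields above suggest a simple polling loop for asynchronous tasks: query the result endpoint until `data.status` is `completed` or `failed`. The following is a minimal sketch under that reading, using the result URL from the API Endpoints section. The function names, the one-second interval, and the poll budget are illustrative assumptions, not documented recommendations.

```python
import json
import time
import urllib.request
from typing import Callable

# Result endpoint as listed in the API Endpoints section of this page.
RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id: str, api_key: str) -> dict:
    """GET the current state of a task and return the decoded JSON."""
    request = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def wait_for_outputs(
    fetch: Callable[[], dict], interval: float = 1.0, max_polls: int = 60
) -> list:
    """Call fetch() until the task completes, then return data.outputs."""
    for _ in range(max_polls):
        data = fetch()["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "generation failed")
        # Status is still "created" or "processing": wait and retry.
        time.sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")


# Example usage (performs real network calls, so it is left commented out):
# outputs = wait_for_outputs(lambda: fetch_result(request_id, api_key))
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client, which also makes it easy to exercise with canned responses.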