Moondream3 Preview Query

Moondream3 Query answers natural-language questions about images, with optional chain-of-thought reasoning for more detailed explanations. It is available as a ready-to-use REST inference API with no cold starts and per-request pricing.

Features

Moondream 3 — Visual Question Answering (VQA)

Moondream 3 Query is a vision-language model that understands images and answers natural-language questions about them. It combines fast inference, scene understanding, and optional reasoning that explains its answers, which suits analysis, education, and creative applications.

✨ Key Features

Visual Q&A: Ask questions about any image (people, objects, actions, or scenes) and receive natural-language answers.
Chain-of-Thought Reasoning: Enable reasoning mode to have the model explain how it reached its conclusion, which is useful for analysis and debugging.
Accurate Visual Understanding: Trained on diverse, high-quality image-text datasets for reliable recognition of complex visual contexts.
Fast and Lightweight: Optimized for low latency and efficient inference while maintaining strong reasoning performance.

⚙️ Example Usage

🔹 Basic Query

{
  "image": "https://example.com/photo.jpg",
  "prompt": "What is the person in the image doing?"
}

🔹 Query with Reasoning

{
  "image": "https://example.com/photo.jpg",
  "prompt": "What emotions are visible in this scene?",
  "reasoning": true
}
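
The two request bodies above differ only in the optional reasoning flag, so one helper can produce both. The following is a minimal Python sketch; `build_query_payload` is a hypothetical helper name, not part of any official SDK, and only the field names (`image`, `prompt`, `reasoning`) come from this page.

```python
import json


def build_query_payload(image_url: str, prompt: str, reasoning: bool = False) -> str:
    """Return the JSON request body for a Moondream3 query.

    Hypothetical helper: only the field names are taken from the documentation.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    payload = {"image": image_url, "prompt": prompt}
    # Send the flag only when it is enabled; the documented default is false.
    if reasoning:
        payload["reasoning"] = True
    return json.dumps(payload)
```

Calling `build_query_payload("https://example.com/photo.jpg", "What emotions are visible in this scene?", reasoning=True)` yields a body equivalent to the second example.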

💡 Best Practices

Ask clear and specific questions for higher accuracy.
Enable reasoning mode for tasks that require multi-step or contextual analysis.
Supported image formats: JPEG, PNG, WebP
Maximum image size: 10 MB
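
The format and size limits above can be checked on the client before a request is sent. This is a rough sketch under two assumptions: the format is guessed from the file extension only, and "10 MB" is taken to mean 10 * 1024 * 1024 bytes, which may not match how the service counts. `check_image` is a hypothetical helper.

```python
import os

# Formats listed in the best practices above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
# Assumption: "10 MB" means 10 * 1024 * 1024 bytes; the service may count differently.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def check_image(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_IMAGE_BYTES}")
    return problems
```

An extension check is only a heuristic; the API remains the authority on what it accepts.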

💰 Pricing

$0.005 per request
Volume discounts available — please contact WaveSpeedAI for enterprise or batch pricing.
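
For budgeting, the list price translates into a simple multiplication. The sketch below assumes every request is billed at the flat $0.005 rate and ignores any volume discount, since discount terms are not published here; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# List price per request, from the pricing section above.
PRICE_PER_REQUEST_USD = Decimal("0.005")


def estimate_cost(request_count: int) -> Decimal:
    """Return the list-price cost in USD for a number of requests (no discounts)."""
    if request_count < 0:
        raise ValueError("request_count must be non-negative")
    return PRICE_PER_REQUEST_USD * request_count
```

For example, 10,000 requests at list price come to 50 USD.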

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/moondream3-preview/query" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/photo.jpg",
    "prompt": "What is the person in the image doing?",
    "reasoning": false,
    "enable_sync_mode": true
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
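
The same submission can be made from Python with only the standard library. This is a sketch, not an official client: the function names are hypothetical, the 60-second timeout is an arbitrary choice, and the endpoint, headers, and field names are copied from the curl example.

```python
import json
import urllib.request

API_URL = "https://api.wavespeed.ai/api/v3/wavespeed-ai/moondream3-preview/query"


def build_submit_request(api_key: str, image_url: str, prompt: str,
                         reasoning: bool = False, sync: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({
        "image": image_url,
        "prompt": prompt,
        "reasoning": reasoning,
        "enable_sync_mode": sync,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call reads the key from the `WAVESPEED_API_KEY` environment variable, passes it to `build_submit_request` together with the image URL and question, and hands the result to `submit`. Keeping request construction separate from sending makes the request easy to inspect without touching the network.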

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
image	string	Yes	-	-	Image to be analyzed. Provide an HTTPS URL or upload an image file.
prompt	string	Yes	-	-	Your question about the image.
reasoning	boolean	No	false	-	Enable chain-of-thought reasoning to get more detailed explanations.
enable_sync_mode	boolean	No	true	-	If true, the request waits for the result and returns it in the response. This property is only available through the API.

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.has_nsfw_contents	array	Array of boolean values indicating NSFW detection for each output
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
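
The fields in the table above can be reduced to the few values most callers act on. The sketch below is illustrative only: `summarize_submission` is a hypothetical helper, and the sample response is assembled from the documented field names with made-up values.

```python
def summarize_submission(response: dict) -> dict:
    """Extract the fields most callers need from a task submission response."""
    data = response.get("data") or {}
    return {
        "ok": response.get("code") == 200,
        "task_id": data.get("id"),
        "status": data.get("status"),
        "result_url": (data.get("urls") or {}).get("get"),
        # Normalize an empty error string to None.
        "error": data.get("error") or None,
    }


# Illustrative response built from the documented fields; every value is made up.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "task-123",
        "model": "wavespeed-ai/moondream3-preview/query",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/task-123/result"},
        "status": "processing",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 0},
    },
}

summary = summarize_submission(sample)
```

Using `.get` with fallbacks keeps the helper from raising if a field is absent, which is a defensive choice rather than documented behavior.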

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier of the requested prediction
data.model	string	Model ID used for the prediction
data.outputs	string	The model output (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
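
When sync mode is disabled, the result endpoint has to be polled until `data.status` reaches `completed` or `failed`. The following is a sketch of such a loop; `wait_for_result` is a hypothetical helper, `fetch` stands for any zero-argument callable that returns the decoded JSON of the result endpoint, and the one-second interval and 30-attempt limit are arbitrary defaults rather than documented values.

```python
import time
from typing import Callable


def wait_for_result(fetch: Callable[[], dict], interval_s: float = 1.0,
                    max_attempts: int = 30) -> dict:
    """Call `fetch` until the task reaches a terminal status, then return its data object."""
    for attempt in range(max_attempts):
        data = fetch().get("data") or {}
        status = data.get("status")
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        if status == "completed":
            return data
        # Statuses such as `created` and `processing` mean the task is still running.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

Passing the fetch step in as a callable keeps the loop independent of any HTTP library and lets it be exercised with canned responses.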
