Moondream3 Preview Detect

Moondream3 Detect: object bounding boxes for localizing objects in images. Available as a ready-to-use REST inference API with no cold starts and per-request pricing.

Features

Moondream 3 — Object Detection

Moondream 3 Detect is a powerful vision-language model for identifying and localizing objects within images. It uses natural language input to detect specific items and returns their bounding box coordinates with high precision — ideal for visual search, annotation, and AI-assisted labeling.

✨ Key Features

Natural Language Object Queries: Describe what you want to detect, e.g., “person,” “car,” “dog,” or “chair.”
Accurate Bounding Boxes: Returns x_min, y_min, x_max, y_max coordinates for each detected instance.
Multi-Object Detection: Supports multiple instances of the same category in one image.
Fast and Lightweight: Optimized for real-time or batch detection workflows with low latency.

⚙️ Example Usage

🔹 Detect Cars

{
  "image": "https://example.com/photo.jpg",
  "prompt": "car"
}

🔹 Detect People

{
  "image": "https://example.com/photo.jpg",
  "prompt": "person"
}

🔹 Detect Any Object

{
  "image": "https://example.com/photo.jpg",
  "prompt": "bicycle"
}

📦 Output Format

Bounding boxes are returned in normalized coordinates (range 0–1):

{
  "objects": [
    {
      "x_min": 0.1556,
      "x_max": 0.6881,
      "y_min": 0.2610,
      "y_max": 0.9551
    }
  ]
}

where

(x_min, y_min) = top-left corner
(x_max, y_max) = bottom-right corner

If multiple objects are detected, all boxes appear in the "objects" array.
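
Because the coordinates are normalized, turning them into pixel positions requires the image's width and height. The following is a minimal Python sketch, assuming the response has already been parsed into a dict shaped like the example above; `to_pixel_boxes` is a hypothetical helper name, not part of the API.

```python
def to_pixel_boxes(result, width, height):
    """Convert normalized boxes (range 0-1) into integer pixel coordinates.

    `result` is assumed to look like the example output:
    {"objects": [{"x_min": ..., "y_min": ..., "x_max": ..., "y_max": ...}]}
    """
    boxes = []
    for obj in result.get("objects", []):
        boxes.append({
            "left": round(obj["x_min"] * width),
            "top": round(obj["y_min"] * height),
            "right": round(obj["x_max"] * width),
            "bottom": round(obj["y_max"] * height),
        })
    return boxes


sample = {
    "objects": [
        {"x_min": 0.1556, "x_max": 0.6881, "y_min": 0.2610, "y_max": 0.9551}
    ]
}
print(to_pixel_boxes(sample, width=1000, height=1000))
```

For a 1000 x 1000 image, the sample box spans roughly pixels 156 to 688 horizontally and 261 to 955 vertically.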

💡 Best Practices

Use specific, clear object names for best accuracy.
For small or distant objects, higher-resolution images improve detection.
Supported formats: JPEG, PNG, WebP
Maximum image size: 10 MB
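
The format and size limits can be checked locally before a request is sent. This is a rough Python sketch under two assumptions: the file extension is used as a stand-in for the real image format, and "10 MB" is read conservatively as 10,000,000 bytes. `check_image` is a hypothetical helper, not part of the API.

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_BYTES = 10 * 1000 * 1000  # assumption: "10 MB" taken as 10,000,000 bytes


def check_image(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems


print(check_image("photo.JPG", 2_500_000))
print(check_image("scan.tiff", 2_500_000))
```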

💰 Pricing

$0.001 per request
Contact WaveSpeedAI for bulk or enterprise pricing options.
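
At the listed per-request rate, a batch estimate is a single multiplication. A small Python sketch follows (the function name is hypothetical; `Decimal` is used to avoid floating-point rounding on currency values):

```python
from decimal import Decimal

PRICE_PER_REQUEST = Decimal("0.001")  # USD per request, as listed above


def estimate_cost(num_requests):
    """Estimated cost in USD for a number of requests at the listed rate."""
    return PRICE_PER_REQUEST * num_requests


print(estimate_cost(25_000))
```

For example, 25,000 requests come to 25 USD at this rate, before any bulk or enterprise arrangement.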

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/moondream3-preview/detect" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/photo.jpg",
    "prompt": "car",
    "enable_sync_mode": true
}'

# Get the result (requestId is the task ID returned as data.id by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
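
The same two calls can be sketched in Python using only the standard library. This is an illustrative sketch, not an official client: the helper names (`build_submit_request`, `build_result_request`, `send`, `detect`), the polling interval, and the polling budget are assumptions. The URLs, headers, request fields, and the `data.id` and `data.status` response fields come from this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_URL = f"{API_BASE}/wavespeed-ai/moondream3-preview/detect"


def build_submit_request(api_key, image_url, prompt, sync=True):
    """Build the POST request that submits a detection task."""
    body = json.dumps({
        "image": image_url,
        "prompt": prompt,
        "enable_sync_mode": sync,
    }).encode("utf-8")
    return urllib.request.Request(
        SUBMIT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def build_result_request(api_key, request_id):
    """Build the GET request that fetches a task's result by ID."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def detect(api_key, image_url, prompt, transport=send,
           poll_seconds=1.0, max_polls=30):
    """Submit a task, then poll the result endpoint until it finishes.

    `transport` is injectable so the flow can be exercised without network
    access; `poll_seconds` and `max_polls` are illustrative defaults.
    """
    reply = transport(build_submit_request(api_key, image_url, prompt))
    polls = 0
    while reply["data"]["status"] not in ("completed", "failed"):
        if polls >= max_polls:
            raise TimeoutError("task did not finish within the polling budget")
        time.sleep(poll_seconds)
        reply = transport(build_result_request(api_key, reply["data"]["id"]))
        polls += 1
    return reply["data"]
```

A call might look like `detect(os.environ["WAVESPEED_API_KEY"], "https://example.com/photo.jpg", "car")`. If the submit response already reports `completed`, the loop returns without polling.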

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
image	string	Yes	-	-	Image to analyze. Provide an HTTPS URL or upload an image file.
prompt	string	Yes	-	-	Object to detect in the image (e.g., 'car', 'person', 'dog').
enable_sync_mode	boolean	No	true	-	If set to true, the request waits for the result before returning the response. This property is only available through the API.
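
The required and optional fields in the table above can be enforced before a request is built. This is a minimal Python sketch; `build_payload` and its error messages are hypothetical, and the checks cover only what the table states (two required strings and one optional boolean).

```python
def build_payload(image, prompt, enable_sync_mode=True):
    """Assemble the request body, rejecting missing or mistyped fields."""
    for name, value in (("image", image), ("prompt", prompt)):
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{name}' is required and must be a non-empty string")
    if not isinstance(enable_sync_mode, bool):
        raise TypeError("'enable_sync_mode' must be a boolean")
    return {
        "image": image,
        "prompt": prompt.strip(),
        "enable_sync_mode": enable_sync_mode,
    }


print(build_payload("https://example.com/photo.jpg", "car"))
```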

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.has_nsfw_contents	array	Array of boolean values indicating NSFW detection for each output
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier of the prediction being retrieved (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	string	Object containing the model output (empty when status is not `completed`).
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
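
A caller usually branches on `code` and `data.status` before reading `data.outputs`. The sketch below shows one way to do that in Python, assuming the reply has been parsed into a dict with the fields listed above. `extract_outputs` is a hypothetical helper, and it returns `data.outputs` as-is without interpreting its contents.

```python
def extract_outputs(reply):
    """Return data.outputs for a completed task.

    Returns None while the task is still `created` or `processing`, and
    raises RuntimeError when the request or the task reports a failure.
    """
    if reply.get("code") != 200:
        raise RuntimeError(f"request failed: {reply.get('message')}")
    data = reply["data"]
    status = data["status"]
    if status == "completed":
        return data["outputs"]
    if status == "failed":
        raise RuntimeError(f"task failed: {data.get('error')}")
    return None  # not finished yet; query the result endpoint again later
```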
