Moondream3 Preview Detect

Moondream3 Detect returns precise object bounding boxes for image localization tasks. It is available as a ready-to-use REST inference API with no cold starts and affordable pricing.

Features

Moondream 3 — Object Detection

Moondream 3 Detect is a powerful vision-language model for identifying and localizing objects within images. It uses natural language input to detect specific items and returns their bounding box coordinates with high precision — ideal for visual search, annotation, and AI-assisted labeling.


✨ Key Features

  • Natural Language Object Queries: Describe what you want to detect, e.g., “person,” “car,” “dog,” “chair.”

  • Accurate Bounding Boxes: Returns x_min, y_min, x_max, y_max coordinates for each detected instance.

  • Multi-Object Detection: Supports multiple instances of the same category in one image.

  • Fast and Lightweight: Optimized for low-latency real-time or batch detection workflows.


⚙️ Example Usage

🔹 Detect Cars

{
  "image": "https://example.com/photo.jpg",
  "prompt": "car"
}

🔹 Detect People

{
  "image": "https://example.com/photo.jpg",
  "prompt": "person"
}

🔹 Detect Any Object

{
  "image": "https://example.com/photo.jpg",
  "prompt": "bicycle"
}

📦 Output Format

Bounding boxes are returned in normalized coordinates (range 0–1):

{
  "objects": [
    {
      "x_min": 0.1556,
      "x_max": 0.6881,
      "y_min": 0.2610,
      "y_max": 0.9551
    }
  ]
}

where

  • (x_min, y_min) = top-left corner
  • (x_max, y_max) = bottom-right corner

If multiple objects are detected, all boxes appear in the "objects" array.
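
Because the coordinates are normalized, they must be scaled by the image dimensions before drawing or cropping. The following is a minimal Python sketch of that conversion; `to_pixel_boxes` is a hypothetical helper (not part of the API), and the 2000×1000 image size is an assumed example value.

```python
from typing import Dict, List, Tuple


def to_pixel_boxes(
    objects: List[Dict[str, float]], width: int, height: int
) -> List[Tuple[int, int, int, int]]:
    """Scale normalized boxes to (left, top, right, bottom) pixel tuples."""
    boxes = []
    for obj in objects:
        left = round(obj["x_min"] * width)     # x values scale with image width
        top = round(obj["y_min"] * height)     # y values scale with image height
        right = round(obj["x_max"] * width)
        bottom = round(obj["y_max"] * height)
        boxes.append((left, top, right, bottom))
    return boxes


# The sample output shown above, applied to an assumed 2000x1000 pixel image.
sample = {
    "objects": [
        {"x_min": 0.1556, "x_max": 0.6881, "y_min": 0.2610, "y_max": 0.9551}
    ]
}
pixel_boxes = to_pixel_boxes(sample["objects"], width=2000, height=1000)
print(pixel_boxes)  # [(311, 261, 1376, 955)]
```

Rounding to the nearest pixel is one reasonable choice; flooring the minimum corner and ceiling the maximum corner would instead guarantee the pixel box fully covers the normalized one.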


💡 Best Practices

  • Use specific, clear object names for best accuracy.
  • For small or distant objects, higher-resolution images improve detection.
  • Supported formats: JPEG, PNG, WebP
  • Maximum image size: 10 MB
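
The format and size limits above can be checked locally before a request is submitted. This is a minimal Python sketch under two assumptions: that "10 MB" means 10 × 1024 × 1024 bytes, and that the file extension reflects the actual image format. `check_image` is a hypothetical helper, not part of the WaveSpeedAI API.

```python
import os
from typing import List

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" is interpreted as 10 MiB


def check_image(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_IMAGE_BYTES}")
    return problems


print(check_image("photo.JPG", 2_000_000))          # no problems reported
print(check_image("scan.tiff", 11 * 1024 * 1024))   # extension and size problems
```

For a local file, `os.path.getsize(path)` supplies the `size_bytes` argument. The extension check is only a heuristic; the service itself decides what it accepts.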

💰 Pricing

  • $0.001 per request
  • Contact WaveSpeedAI for bulk or enterprise pricing options.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/moondream3-preview/detect" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/photo.jpg",
    "prompt": "car",
    "enable_sync_mode": true
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
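
The same two calls can be sketched in Python using only the standard library. The endpoint URLs and headers are taken from the curl commands above, and the payload includes the required `image` and `prompt` parameters listed under Request Parameters. The function names are hypothetical, and using `data.id` as the `requestId` is an assumption based on the response table, which describes it as the task ID.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
DETECT_PATH = "/wavespeed-ai/moondream3-preview/detect"


def build_submit_request(api_key, image_url, prompt, sync=True):
    """Build (but do not send) the POST request that submits a detection task."""
    payload = {"image": image_url, "prompt": prompt, "enable_sync_mode": sync}
    return urllib.request.Request(
        API_BASE + DETECT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key, request_id):
    """Build (but do not send) the GET request that fetches a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def run_detection(api_key, image_url, prompt):
    """Submit a task, then fetch its result by ID (performs network calls)."""
    submitted = send(build_submit_request(api_key, image_url, prompt))
    request_id = submitted["data"]["id"]  # assumption: data.id is the requestId
    return send(build_result_request(api_key, request_id))
```

A call such as `run_detection(os.environ["WAVESPEED_API_KEY"], "https://example.com/photo.jpg", "car")` would exercise both endpoints. Separating request construction from sending keeps the request shape inspectable without network access.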

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| image | string | Yes | - | - | Image to analyze. Provide an HTTPS URL or upload an image file. |
| prompt | string | Yes | - | - | Object to detect in the image (e.g., 'car', 'person', 'dog'). |
| enable_sync_mode | boolean | No | true | - | If set to true, the request waits for the result before returning the response. This property is only available through the API. |

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
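
The status values above suggest a simple way to interpret a response. The following Python sketch is illustrative only: `extract_outputs` is a hypothetical helper, and the sample response is invented to match the documented field names, not captured from the live API.

```python
def extract_outputs(response: dict) -> list:
    """Return data.outputs for a completed task, or raise if the task failed."""
    data = response.get("data", {})
    status = data.get("status")
    if status == "failed":
        # data.error carries the error message when the task fails.
        raise RuntimeError(data.get("error") or "task failed")
    if status != "completed":
        # created or processing: outputs are documented as empty at this point.
        return []
    return data.get("outputs", [])


# Hypothetical responses shaped after the table above (all values are made up).
pending = {"code": 200, "message": "success",
           "data": {"id": "abc123", "status": "processing", "outputs": []}}
done = {"code": 200, "message": "success",
        "data": {"id": "abc123", "status": "completed", "outputs": ["result-0"]}}

print(extract_outputs(pending))  # []
print(extract_outputs(done))     # ['result-0']
```

When `enable_sync_mode` is false, a caller would typically repeat the result request until `data.status` is either completed or failed.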

Result Request Parameters

© 2025 WaveSpeedAI. All rights reserved.