vision-language

wavespeed-ai/moondream3-preview/query

Ask natural language questions about images and get intelligent answers. Moondream3 Query supports visual Q&A with optional chain-of-thought reasoning for detailed explanations.

Enable chain-of-thought reasoning to get more detailed explanations.
If set to true, the function will wait for the result before returning the response. This property is only available through the API.

Your request will cost $0.005 per run.

For $1 you can run this model approximately 200 times.

README

Moondream3 Query - Visual Question Answering

Moondream3 Query is a specialized vision-language model for asking natural language questions about images and receiving intelligent answers.

Features

  • Visual Q&A: Ask any question about an image in natural language
  • Chain-of-Thought Reasoning: Enable detailed explanations of the model's reasoning process
  • Fast Response: Optimized for quick inference
  • Accurate Understanding: Trained on diverse visual datasets

Example Usage

Basic Query

{
  "image": "https://example.com/photo.jpg",
  "prompt": "What is the person in the image doing?"
}

Query with Reasoning

{
  "image": "https://example.com/photo.jpg",
  "prompt": "What emotions are visible in this scene?",
  "reasoning": true
}
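The two payloads above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names `build_query_payload` and `build_request` are hypothetical, and the endpoint URL, JSON content type, and Bearer-token header are assumptions not stated on this page, so check the WaveSpeed API documentation for the real values. The request is only prepared, never sent.

```python
import json
import urllib.request


def build_query_payload(image_url, prompt, reasoning=False):
    """Build a JSON body with the same fields as the examples above."""
    if not image_url or not prompt:
        raise ValueError("image_url and prompt must both be non-empty")
    payload = {"image": image_url, "prompt": prompt}
    # Only include "reasoning" when requested, mirroring the basic example.
    if reasoning:
        payload["reasoning"] = True
    return payload


def build_request(endpoint, api_key, payload):
    """Prepare (but do not send) a POST request carrying the payload.

    Assumption: the service accepts a JSON body and Bearer-token auth.
    The endpoint must be supplied by the caller; it is not given here.
    """
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


basic = build_query_payload(
    "https://example.com/photo.jpg",
    "What is the person in the image doing?",
)
with_reasoning = build_query_payload(
    "https://example.com/photo.jpg",
    "What emotions are visible in this scene?",
    reasoning=True,
)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out because the real endpoint and response shape are not documented on this page.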

Best Practices

  • Use clear, specific questions for better results
  • Enable reasoning for complex visual analysis
  • Supported formats: JPEG, PNG, WebP
  • Max image size: 10MB
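The format and size limits above can be checked locally before sending a request. This is a rough sketch under two assumptions: that the file extension is a good enough proxy for the actual image format, and that "10MB" means 10 × 1024 × 1024 bytes (the page does not say which definition applies). The function name `check_image` is hypothetical.

```python
import os

# Extensions corresponding to the listed formats: JPEG, PNG, WebP.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
# Assumption: "10MB" is interpreted as 10 MiB.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def check_image(filename, size_bytes):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(
            f"image is {size_bytes} bytes; limit is {MAX_IMAGE_BYTES}"
        )
    return problems
```

For a local file, `check_image(path, os.path.getsize(path))` gives the same check; images referenced by URL would need their size determined some other way.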

Pricing

Fixed price of $0.005 per request, regardless of the question or whether reasoning is enabled. Contact WaveSpeed for volume discounts.
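For budgeting, the listed $0.005 per-run price makes the arithmetic simple. The sketch below uses `Decimal` to avoid floating-point rounding; the function names are hypothetical and volume discounts are not modeled.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.005")  # USD, the per-run price listed on this page


def cost_for_runs(runs):
    """Total cost in USD for a given number of runs."""
    return PRICE_PER_RUN * runs


def runs_for_budget(budget_usd):
    """Whole number of runs that fit within a budget in USD."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_RUN)
```

With these definitions, `runs_for_budget(1)` gives 200, matching the "approximately 200 times" figure quoted for $1.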