vision-language
Each run costs $0.005, so $1 covers approximately 200 runs.
Moondream3 Caption is a specialized vision-language model that generates descriptive captions for images.
{
  "image": "https://example.com/photo.jpg",
  "length": "short"
}
{
  "image": "https://example.com/photo.jpg",
  "length": "normal"
}
{
  "image": "https://example.com/photo.jpg",
  "length": "long"
}
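The three inputs above differ only in the "length" field, so they can be generated from one helper. The following is a minimal Python sketch, not an official client: the function name build_caption_payload and the constant ALLOWED_LENGTHS are hypothetical, and the set of accepted lengths is assumed to be exactly the three values shown in the examples. The request URL and authentication scheme are not given here, so the sketch stops at building the JSON body.

```python
import json

# Assumption: the accepted values are the three that appear in the examples
# above. The source does not say whether other values are allowed.
ALLOWED_LENGTHS = ("short", "normal", "long")


def build_caption_payload(image_url: str, length: str) -> str:
    """Return a JSON body with the "image" and "length" fields."""
    if not image_url:
        raise ValueError("image_url must be a non-empty string")
    if length not in ALLOWED_LENGTHS:
        raise ValueError(
            f"length must be one of {ALLOWED_LENGTHS}, got {length!r}"
        )
    return json.dumps({"image": image_url, "length": length}, indent=2)


if __name__ == "__main__":
    # Print one request body per caption length.
    for length in ALLOWED_LENGTHS:
        print(build_caption_payload("https://example.com/photo.jpg", length))
```

Validating "length" locally avoids paying for a request that the service might reject.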
Fixed price per request. Contact WaveSpeed for volume discounts.
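Because the price is fixed per request, budgeting is simple arithmetic. The sketch below uses the $0.005 figure quoted on this page; the function names are hypothetical, and volume discounts are not modeled because their terms are not stated.

```python
from decimal import Decimal

# Fixed per-request price quoted on this page, in US dollars.
PRICE_PER_RUN_USD = Decimal("0.005")


def runs_for_budget(budget_usd: str) -> int:
    """Number of whole runs a budget covers at the fixed per-run price."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN_USD)


def cost_for_runs(runs: int) -> Decimal:
    """Total cost in US dollars for a given number of runs."""
    return PRICE_PER_RUN_USD * runs


if __name__ == "__main__":
    print(runs_for_budget("1"))   # whole runs covered by $1
    print(cost_for_runs(1000))    # cost of 1000 runs in dollars
```

Decimal is used instead of float so that sub-cent prices multiply and divide without binary rounding error.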