image-to-text

MiniCPM-V 4.5

wavespeed-ai/minicpm-v/image

MiniCPM-V 4.5 is the latest and most capable MiniCPM-V image model, built for accurate AI image understanding and analysis across visual tasks. It is served through a ready-to-use REST inference API with best performance, no cold starts, and affordable pricing.

If set to true, the request waits for the result to be generated and uploaded before returning, so the result is available directly in the response. This property is only available through the API.
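As a rough illustration of how such a boolean flag might travel in a REST request, here is a minimal sketch that only builds the request and does not send it. The endpoint host, the field names `image` and `sync`, and the Bearer authorization scheme are all assumptions made for illustration; they are not taken from this page, so check the API reference for the real names.

```python
import json
import urllib.request

# Hypothetical placeholder endpoint: NOT the documented URL.
ENDPOINT = "https://api.example.com/wavespeed-ai/minicpm-v/image"


def build_request(image_url: str, api_key: str, wait_for_result: bool) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the wait-for-result flag."""
    body = {
        "image": image_url,       # hypothetical field name for the input image
        "sync": wait_for_result,  # hypothetical name for the flag described above
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_request("https://example.com/cat.png", "YOUR_API_KEY", wait_for_result=True)
print(req.get_method(), json.loads(req.data))
```

With the flag set to false, a client would typically expect to receive a task reference and fetch the result later; that flow is not described on this page and is left out of the sketch.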

Your request will cost $0.005 per run.

For $1 you can run this model approximately 200 times.
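The run count above follows directly from the per-run price. A small sketch of the arithmetic, using `Decimal` to avoid floating-point rounding (the function names are illustrative and not part of any API):

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.005")  # USD per run, as quoted above


def runs_for_budget(budget_usd: str) -> int:
    """Number of whole runs a budget covers at the per-run price."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


def cost_for_runs(runs: int) -> Decimal:
    """Total cost in USD for a given number of runs."""
    return PRICE_PER_RUN * runs


print(runs_for_budget("1"))  # 200
print(cost_for_runs(1000))   # 5.000
```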

README

MiniCPM-V 4.5 AI Video Understanding

Overview

MiniCPM-V is a series of efficient end-side (on-device) multimodal LLMs (MLLMs) that accept images, videos, and text as inputs and produce high-quality text outputs. Supported inputs include text-based queries, video queries, single-image queries, and multi-image queries, for which the model generates captions or responses.