image-to-text

wavespeed-ai/minicpm-v/image

MiniCPM-V 4.5 is the latest and most capable model in the MiniCPM-V series.

If set to true, the request waits until the result has been generated and uploaded before returning, so you get the result directly in the response. This property is only available through the API.

Your request will cost $0.005 per run.

For $1 you can run this model approximately 200 times.
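
The run count above follows from dividing a budget by the quoted per-run price. A minimal sketch of that arithmetic, assuming the $0.005 price stays fixed; the names `COST_PER_RUN` and `runs_for_budget` are hypothetical and not part of any WaveSpeed API:

```python
from decimal import Decimal

# Price per run as quoted on this page, in USD (assumed constant).
COST_PER_RUN = Decimal("0.005")


def runs_for_budget(budget_usd: Decimal) -> int:
    """Return how many whole runs a budget covers at the quoted price."""
    # Floor division on Decimal avoids binary floating-point rounding.
    return int(budget_usd // COST_PER_RUN)


print(runs_for_budget(Decimal("1")))
```

`Decimal` is used instead of `float` so that small prices such as 0.005 are represented exactly.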

README

MiniCPM-V 4.5 AI Video Understanding

Overview

MiniCPM-V is a series of efficient end-side multimodal LLMs (MLLMs) that accept images, videos, and text as inputs and produce high-quality text outputs. The models handle text-only, single-image, multi-image, and video queries, and generate captions or responses.