Alibaba Qwen Image Translate

Advanced AI model for image understanding, OCR, and text translation with Alibaba Qwen Vision.

Features

Alibaba Qwen Vision - Image Understanding & Translation

Alibaba Qwen Vision is a multimodal AI model that integrates OCR (optical character recognition) and multilingual translation. Built on Alibaba Cloud’s DashScope, it can extract text from images and translate it into multiple languages quickly and accurately.

Why it works well

  • Accurate OCR: recognizes printed and handwritten text from images.
  • Multi-language support: detects and translates between English, Chinese, Japanese, Korean, French, German, Spanish, Russian, Arabic, and more.
  • Customizable translation: define terminologies and filter sensitive words for domain-specific use cases.
  • Document understanding: works with forms, receipts, signage, and scanned documents.
  • Real-time performance: fast turnaround for practical scenarios like menus, signs, and learning materials.

Limits and Performance

  • Supported formats: PNG, JPEG, WEBP
  • Processing speed: ~3–6 seconds per image
  • Segmentation: automatic text region detection (can be disabled via skip_image_segment)
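
The supported formats listed above can be checked on the client before a task is submitted. The following is a minimal sketch, assuming the format can be inferred from the file extension; the helper name is_supported_image and the treatment of ".jpg" as JPEG are assumptions of this example, not part of the API.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions for the formats listed above (PNG, JPEG, WEBP).
# Treating ".jpg" as JPEG is an assumption of this sketch.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def is_supported_image(path_or_url: str) -> bool:
    """Return True if the path or URL ends in a supported image extension."""
    # urlparse separates the query string, so "a.png?size=large" still matches.
    path = urlparse(path_or_url).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

An extension check only catches obvious mistakes; it does not confirm that the file content is a valid image.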

Pricing

Task Type | Cost per image
OCR / Translation | $0.01
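
As a worked example of the pricing above, the cost of a batch is the image count multiplied by the per-image price. This is a small sketch that assumes every image is billed at the flat rate shown; the function name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Per-image price taken from the pricing table above.
PRICE_PER_IMAGE_USD = Decimal("0.01")


def estimate_cost(image_count: int) -> Decimal:
    """Estimate the cost in USD of processing image_count images."""
    if image_count < 0:
        raise ValueError("image_count must not be negative")
    # Decimal avoids the rounding drift that floats show with cent values.
    return PRICE_PER_IMAGE_USD * image_count
```

For example, estimate_cost(250) gives 2.50 USD under this assumption.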

How to Use

  1. Upload the image containing text.
  2. Select source_lang (e.g., auto, en, zh, ja, ko, fr, de, es, ru, ar).
  3. Select target_lang for translation.
  4. (Optional) Add sensitives to filter sensitive words.
  5. (Optional) Add terminologies to ensure consistent translations for key terms.
  6. (Optional) Check skip_image_segment if you don’t want automatic segmentation.
  7. Run the job and download/view the results.
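
When calling the API instead of using the interface, the steps above correspond to fields in a JSON request body. The sketch below assembles such a body; the function name build_translate_payload is hypothetical, and the element format of sensitives and terminologies is not specified on this page, so the lists are passed through unchanged.

```python
import json
from typing import Any, Dict, List, Optional


def build_translate_payload(
    image: str,
    target_lang: str,
    source_lang: str = "auto",
    sensitives: Optional[List[Any]] = None,
    terminologies: Optional[List[Any]] = None,
    skip_image_segment: bool = False,
) -> Dict[str, Any]:
    """Assemble the request body for a translation task."""
    return {
        "image": image,                      # step 1: the image containing text
        "source_lang": source_lang,          # step 2: "auto" lets the model detect it
        "target_lang": target_lang,          # step 3: language to translate into
        "sensitives": sensitives or [],      # step 4: optional sensitive-word filter
        "terminologies": terminologies or [],  # step 5: optional term list (format assumed opaque)
        "skip_image_segment": skip_image_segment,  # step 6: True disables segmentation
    }


if __name__ == "__main__":
    # The image URL here is a placeholder, not a real resource.
    payload = build_translate_payload("https://example.com/menu.png", "en")
    print(json.dumps(payload, indent=2))
```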

Pro tips for best quality

  • Upload high-resolution images with clear text for better OCR accuracy.
  • Use auto for source_lang when handling mixed or unknown languages.
  • Add terminologies for industry-specific vocabulary (e.g., finance, medicine).
  • Filter sensitive words via sensitives for safer outputs.
  • Keep segmentation enabled for documents with multiple text regions.

Notes

  • Best for document digitization, translation of signage/menus, multilingual education, and accessibility tools.
  • If you provide an image URL instead of uploading a local file, make sure the URL is publicly accessible. An accessible image shows a preview in the interface.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task (the image URL below is a placeholder; replace it with your own)
curl --location --request POST "https://api.wavespeed.ai/api/v3/alibaba/qwen-image/translate" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/your-image.png",
    "source_lang": "auto",
    "target_lang": "zh",
    "sensitives": [],
    "terminologies": [],
    "skip_image_segment": false
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
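
The two curl calls above can be combined into a submit-then-poll loop. The following Python sketch uses only the standard library. It assumes that the requestId in the result URL is the data.id value returned by the submit call and that the result response carries data.status and data.outputs; the function names, the one-second poll interval, and the 60-poll limit are choices made for this example, not documented behavior.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def call_api(method, url, api_key, body=None):
    """Send one request with a Bearer token and return the decoded JSON."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def translate_image(payload, api_key, call=call_api,
                    poll_interval=1.0, max_polls=60, sleep=time.sleep):
    """Submit a task, then poll the result endpoint until it finishes."""
    submitted = call("POST", f"{API_BASE}/alibaba/qwen-image/translate", api_key, payload)
    # Assumption: data.id from the submit response is the requestId to poll.
    request_id = submitted["data"]["id"]
    result_url = f"{API_BASE}/predictions/{request_id}/result"
    for _ in range(max_polls):
        result = call("GET", result_url, api_key)
        status = result["data"]["status"]
        if status == "completed":
            return result["data"]["outputs"]
        if status == "failed":
            raise RuntimeError(result["data"].get("error") or "task failed")
        sleep(poll_interval)  # still created or processing; wait and retry
    raise TimeoutError(f"task {request_id} did not finish after {max_polls} polls")
```

A call might look like translate_image(payload, os.environ["WAVESPEED_API_KEY"]), where payload is the JSON body shown in the submit request. The call and sleep arguments exist so the loop can be exercised without network access.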

Parameters

Task Submission Parameters

Request Parameters

Parameter | Type | Required | Default | Range | Description
image | string | Yes | - | - | The image to process for translation
source_lang | string | No | auto | auto, en, zh, ja, ko, fr, de, es, ru, ar | Source language code (auto for auto-detection)
target_lang | string | Yes | zh | en, zh, ja, ko, fr, de, es, ru, ar | Target language code for translation
domain_hint | string | No | - | - | A description, in English, of the usage scenario, translation style, and other domain requirements, used to match the translation style to a particular field. For best results, keep it under 200 English words.
sensitives | array | No | - | - | Array of sensitive words to filter
terminologies | array | No | - | - | Array of terminologies to use for translation
skip_image_segment | boolean | No | false | - | Whether to skip image segmentation
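
The constraints in the table above can be checked before a request is sent. This is a rough sketch, not a validator provided by the API: the function name validate_request is hypothetical, and the 200-word figure for domain_hint is a recommendation in the table, so the check reports it as advice instead of treating it as a hard limit.

```python
from typing import Any, Dict, List

# Language codes from the Range column of the table above.
SOURCE_LANGS = {"auto", "en", "zh", "ja", "ko", "fr", "de", "es", "ru", "ar"}
TARGET_LANGS = SOURCE_LANGS - {"auto"}
RECOMMENDED_HINT_WORDS = 200


def validate_request(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in params (empty if none)."""
    problems = []
    if not params.get("image"):
        problems.append("image is required")
    source = params.get("source_lang", "auto")
    if source not in SOURCE_LANGS:
        problems.append(f"unsupported source_lang: {source!r}")
    target = params.get("target_lang")
    if target not in TARGET_LANGS:
        problems.append(f"unsupported target_lang: {target!r}")
    hint = params.get("domain_hint") or ""
    if len(hint.split()) > RECOMMENDED_HINT_WORDS:
        problems.append("domain_hint exceeds the recommended 200 words")
    if not isinstance(params.get("skip_image_segment", False), bool):
        problems.append("skip_image_segment must be a boolean")
    return problems
```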

Response Parameters

Parameter | Type | Description
code | integer | HTTP status code (e.g., 200 for success)
message | string | Status message (e.g., “success”)
data.id | string | Unique identifier for the prediction, Task Id
data.model | string | Model ID used for the prediction
data.outputs | array | Array of URLs to the generated content (empty when status is not completed)
data.urls | object | Object containing related API endpoints
data.urls.get | string | URL to retrieve the prediction result
data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output
data.status | string | Status of the task: created, processing, completed, or failed
data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error | string | Error message (empty if no error occurred)
data.timings | object | Object containing timing details
data.timings.inference | integer | Inference time in milliseconds
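
To illustrate how the response fields above fit together, the sketch below reads a decoded response into a small record. The PredictionResult class and parse_prediction function are hypothetical names for this example, and the sample values in the usage note are made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class PredictionResult:
    id: str
    status: str                   # created, processing, completed, or failed
    outputs: List[str]            # empty until status is completed
    error: str                    # empty if no error occurred
    inference_ms: Optional[int]   # data.timings.inference, if present


def parse_prediction(response: Dict[str, Any]) -> PredictionResult:
    """Extract the commonly used fields from a decoded API response."""
    data = response["data"]
    timings = data.get("timings") or {}
    return PredictionResult(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
        inference_ms=timings.get("inference"),
    )
```

For instance, a response whose data.status is "processing" parses to a record with an empty outputs list, which signals that the result should be requested again later.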

Result Request Parameters

© 2025 WaveSpeedAI. All rights reserved.