WaveSpeed.ai
image-to-text

SAM 3 RLE

wavespeed-ai/sam3-image-rle

SAM 3 RLE is a unified foundation model for promptable image segmentation that uses text, points, or boxes to detect and segment objects. It returns RLE (Run-Length Encoding) masks for efficient storage and processing. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Input

Whether to overlay the segmentation mask on the original image
If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

{ "rle": "569516 54 571050 59 572585 63 574119 79 575654 84 577190 85 578725 88 580259 93 581794 96 583330 97 584865 100 586401 101 587937 101 589473 102 591009 102 592545 103 594081 105 595617 106 597153 106 598689 107 600225 107 601455 11 601761 107 602988 15 603297 107 604522 19 604832 109 606056 24 606367 110 607591 57 607902 111 609126 59 609437 112 610662 60 610973 113 612197 63 612509 114 613732 66 614045 115 615267 67 615581 115 616802 68 617117 115 618337 69 618653 116 619873 69 620189 116 621409 69 621725 116 622945 70 623262 115 624481 70 624799 114 626017 70 626335 114 627553 71 627870 115 629089 72 629405 116 630625 73 630940 117 632161 73 632476 117 633697 73 634012 117 635233 73 635548 117 636769 73 637084 117 638305 72 638621 116 639841 71 640158 115 641377 70 641696 113 642913 71 643232 113 644449 73 644768 113 645985 74 646305 112 647521 74 647841 112 649057 75 649376 112 650593 75 650912 111 652129 75 652448 110 653665 75 653984 109 655201 75 655520 109 656737 75 657056 108 658273 75 658593 107 659809 75 660129 107 661345 75 661665 107 662881 75 663201 106 664417 75 664737 106 665953 75 666273 105 667489 75 667809 103 669025 75 669345 102 670562 73 670881 102 672099 72 672418 100 673637 70 673954 99 675174 68 675492 95 676710 66 677029 91 678246 65 678565 87 679782 64 680102 84 681319 62 681639 81 682856 48 683176 72 684394 38 684715 63 685932 33 686253 59 687470 25 687790 55 689329 15 690867 9 896397 39 897931 43 899465 47 901000 53 902535 63 904071 65 905606 67 907142 69 908678 69 910214 70 911750 71 913286 72 914822 73 916358 74 917894 74 919430 75 920966 75 922502 75 924038 75 925574 75 927109 76 928645 76 930181 76 931716 77 933252 77 934787 78 936322 79 937858 79 939394 79 940930 79 942466 79 944002 79 945538 79 947074 79 948610 79 950146 79 951682 79 953218 79 954754 79 956291 78 957829 76 959365 76 960902 75 962438 75 963974 75 965510 75 967046 74 968582 73 970119 71 971655 69 973193 66 974730 64 976267 62 977804 50 979341 41 980878 38 
982417 32" }

Your request will cost $0.005 per run.

For $1 you can run this model approximately 200 times.

README

SAM3 Image Segmentation RLE

SAM3 Image Segmentation RLE is an advanced image segmentation model based on Meta's Segment Anything Model 3. It returns segmentation masks in RLE (Run-Length Encoding) format — a compact, program-friendly output ideal for API integration, automated pipelines, and machine learning workflows.

Why Choose This?

  • RLE output format: Returns compact Run-Length Encoded mask data instead of image files, which means a smaller payload and faster transfer.

  • Multiple prompt types: Segment objects using text prompts, point prompts, box prompts, or any combination.

  • API-optimized: Designed for programmatic use, batch processing, and automated workflows.

  • COCO-compatible: RLE format is directly compatible with COCO dataset tools and annotation pipelines.

  • Prompt Enhancer: Built-in tool to automatically improve your text prompts for better results.

  • Ultra-affordable: Just $0.005 per image for professional-quality segmentation.

Parameters

Parameter      Required  Description
image          Yes       Source image to segment (upload or URL)
prompt         No*       Text description of the object to segment
point_prompts  No*       Point coordinates to identify the target object
box_prompts    No*       Bounding box coordinates to identify the target object
apply_mask     No        Overlay the segmentation mask on the original image

*At least one prompt type (text, boxes, or points) must be provided.
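
The prompt rule above can be checked on the client before a request is sent. The following is a minimal sketch: `build_payload` is a hypothetical helper, the field names are taken from the parameter table, and point and box entries are passed through unchanged because their exact shape is not documented on this page.

```python
import json


def build_payload(image, prompt=None, point_prompts=None, box_prompts=None, apply_mask=False):
    """Assemble a request body and enforce the at-least-one-prompt rule.

    Hypothetical helper: field names follow the parameter table; point and box
    entries are passed through as given because their shape is not specified.
    """
    if not image:
        raise ValueError("image is required")
    payload = {"image": image, "apply_mask": bool(apply_mask)}
    # Keep only the prompt types the caller actually supplied.
    prompts = {"prompt": prompt, "point_prompts": point_prompts, "box_prompts": box_prompts}
    supplied = {name: value for name, value in prompts.items() if value}
    if not supplied:
        raise ValueError("provide at least one of: prompt, point_prompts, box_prompts")
    payload.update(supplied)
    return payload


body = build_payload("https://example.com/photo.jpg", prompt="the dog")
print(json.dumps(body))
```

Failing early on a missing prompt avoids paying for a request that the service would reject anyway.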

How to Use

  1. Upload your image — drag and drop or paste a URL.
  2. Add prompts — provide at least one of the following:
    • Text prompt — describe the object to segment (e.g., "the man", "the dog").
    • Point prompts — click "+ Add Item" to add point coordinates.
    • Box prompts — click "+ Add Item" to add bounding box coordinates.
  3. Enable apply_mask (optional) — check to include mask overlay data.
  4. Run — submit and receive RLE-encoded segmentation data.

Output Format

The model returns RLE (Run-Length Encoding) data in JSON format:

{
  "rle": "146301 3 147834 11 149368 14 150903 16 ..."
}
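
The numbers in the sample appear to alternate between a start offset and a run length: the first value of each pair grows by roughly the same step, while the second stays small. Under that assumption (it is not stated explicitly on this page), a short sketch for reading the pairs and counting the masked pixels could look like this. `parse_rle_pairs` is a hypothetical helper, and the input is only the visible prefix of the truncated example above.

```python
def parse_rle_pairs(rle):
    """Split the "rle" string into (start, length) tuples.

    Assumes the integers alternate start, length, as the sample output suggests.
    """
    values = [int(token) for token in rle.split()]
    if len(values) % 2:
        raise ValueError("expected an even number of integers")
    return list(zip(values[0::2], values[1::2]))


# Only the visible prefix of the example; the real string is longer.
response = {"rle": "146301 3 147834 11 149368 14 150903 16"}
pairs = parse_rle_pairs(response["rle"])
mask_area = sum(length for _, length in pairs)  # number of masked pixels
print(pairs[0], mask_area)  # (146301, 3) 44
```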

Decoding RLE in Python

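from pycocotools import mask as mask_utils

# height and width are the pixel dimensions of the source image.
# Caution: mask_utils.decode expects COCO RLE, whose string "counts" use COCO's
# compressed encoding. The "rle" values shown above look like space-separated
# "start length" pairs, so they may need converting to COCO counts first.
rle_data = {"counts": "146301 3 147834 11 ...", "size": [height, width]}
binary_mask = mask_utils.decode(rle_data)  # Returns a NumPy array of shape (height, width)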
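
If the "rle" string holds "start length" pairs, as the sample output on this page suggests, a plain NumPy decoder is an alternative that needs no extra dependency. This is a sketch under assumptions: the indexing base and the scan order are not documented here, so both are exposed as flags (`one_based`, `column_major`) and should be verified against a mask you can inspect visually. `decode_start_length_rle` is a hypothetical name.

```python
import numpy as np


def decode_start_length_rle(rle, height, width, one_based=False, column_major=False):
    """Decode "start length" pairs into a (height, width) uint8 mask.

    Assumptions to verify against a known mask: indexing base and scan order.
    """
    values = [int(token) for token in rle.split()]
    starts, lengths = values[0::2], values[1::2]
    flat = np.zeros(height * width, dtype=np.uint8)
    for start, length in zip(starts, lengths):
        if one_based:
            start -= 1  # shift 1-based offsets to 0-based
        flat[start:start + length] = 1
    order = "F" if column_major else "C"
    return flat.reshape((height, width), order=order)


mask = decode_start_length_rle("1 2 6 1", height=2, width=4)
print(mask.tolist())  # [[0, 1, 1, 0], [0, 0, 1, 0]]
```

A quick sanity check is to decode with each flag combination and compare the result with an overlay of the same request; only one combination should line up with the object.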

Pricing

Item       Cost
Per image  $0.005

Simple flat-rate pricing regardless of image size or prompt complexity.
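
Because the rate is flat, batch budgeting is simple arithmetic. The sketch below uses `Decimal` to avoid floating-point rounding; the function names are illustrative, and the price is the one from the table above.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.005")  # USD, from the pricing table


def batch_cost(num_images):
    """Total cost in USD for a batch of images."""
    return PRICE_PER_IMAGE * num_images


def runs_for_budget(budget_usd):
    """Number of whole runs a budget in USD covers."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_IMAGE)


print(batch_cost(10_000))  # 50.000
print(runs_for_budget(1))  # 200
```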

Best Use Cases

  • ML Data Annotation — Generate segmentation masks for training datasets in COCO format.
  • Automated Pipelines — Integrate segmentation into batch processing workflows.
  • API Integration — Compact output for efficient API responses.
  • Computer Vision — Programmatic mask processing for CV applications.
  • Background Removal at Scale — Extract masks for automated image processing.
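
For the background-removal case, a decoded binary mask can serve as the alpha channel of the source image. This is a minimal sketch that assumes the mask has already been decoded into a NumPy array with the same height and width as the image; `cut_out` is a hypothetical helper, not part of the API.

```python
import numpy as np


def cut_out(image_rgb, mask):
    """Return an RGBA copy of image_rgb whose alpha channel is the mask."""
    if image_rgb.shape[:2] != mask.shape:
        raise ValueError("mask and image sizes differ")
    # Masked pixels become fully opaque, everything else fully transparent.
    alpha = (mask > 0).astype(np.uint8) * 255
    return np.dstack([image_rgb, alpha])


image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
rgba = cut_out(image, mask)
print(rgba[..., 3].tolist())  # [[255, 0], [0, 255]]
```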

Pro Tips

  • Use this model when you need programmatic access to mask data.
  • Use SAM3 Image if you need direct image output.
  • RLE format is compatible with pycocotools for easy decoding.
  • Combine multiple prompt types for more accurate segmentation.
  • Text prompts work best for common objects with clear descriptions.
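
When masks need to be written into COCO-style annotation files, one option is to re-encode a decoded binary mask as COCO uncompressed RLE: alternating run lengths of zeros and ones, scanned column by column and always starting with a run of zeros. The sketch below assumes the mask is already a NumPy array; `mask_to_coco_uncompressed` is a hypothetical helper and does not use pycocotools itself.

```python
import numpy as np


def mask_to_coco_uncompressed(mask):
    """Encode a binary mask as COCO uncompressed RLE (column-major run lengths)."""
    height, width = mask.shape
    flat = (mask > 0).astype(np.uint8).ravel(order="F")
    # Positions where the value changes mark the run boundaries.
    changes = np.flatnonzero(np.diff(flat)) + 1
    edges = np.concatenate([[0], changes, [flat.size]])
    counts = np.diff(edges).tolist()
    if flat.size and flat[0] == 1:
        counts = [0] + counts  # COCO counts always begin with a run of zeros
    return {"counts": counts, "size": [height, width]}


example = np.array([[0, 1], [1, 1]], dtype=np.uint8)
print(mask_to_coco_uncompressed(example))  # {'counts': [1, 3], 'size': [2, 2]}
```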

Notes

  • At least one prompt type must be provided (text, points, or boxes).
  • Output is RLE-encoded JSON, not an image file.
  • Use pycocotools or similar libraries to decode RLE data.
  • Ideal for automated and batch processing workflows.

Related Models

  • SAM3 Image — Same segmentation with direct image output.
  • Bria RMBG — Background removal model.