Flux Controlnet Union Pro 2.0

FLUX.1 ControlNet Union Pro 2.0 is a high-performance endpoint for the FLUX.1 model with advanced ControlNet capabilities. It supports multiple control modes (Canny, soft edge, depth, pose, and grayscale) for precise image generation and control.

Features

FLUX.1-dev-ControlNet-Union-Pro-2.0 is an enhanced unified ControlNet for the FLUX.1-dev model, released by Shakker Labs. This version offers significant improvements over the previous Pro version, with better performance and control capabilities.

Key Improvements (vs Pro 1.0)

  • Smaller Model Size: Removed mode embedding for reduced memory footprint
  • Enhanced Control: Improved Canny and Pose control with better aesthetics
  • New Soft Edge Support: Added AnylineDetector-based soft edge control
  • Streamlined Architecture: Simplified from 12 modes to 5 optimized modes

Technical Specifications

  • Architecture: 6 double blocks + 0 single blocks (mode embedding removed)
  • Training: 300k steps on 20M high-quality general and human images
  • Resolution: 512x512 training resolution
  • Precision: BFloat16
  • Batch Size: 128
  • Learning Rate: 2e-5
  • Guidance Range: [1, 7] uniformly sampled
  • Text Drop Ratio: 0.20

Supported Control Modes

1. Canny Edge Detection

  • Detector: cv2.Canny
  • Recommended Settings: conditioning_scale=0.7, guidance_end=0.8
  • Use Case: Precise edge-based control for structural guidance

2. Soft Edge Detection

  • Detector: AnylineDetector
  • Recommended Settings: conditioning_scale=0.7, guidance_end=0.8
  • Use Case: Softer, more natural edge detection for artistic control

3. Depth Control

  • Detector: depth-anything
  • Recommended Settings: conditioning_scale=0.8, guidance_end=0.8
  • Use Case: 3D depth-aware image generation

4. Human Pose Control

  • Detector: DWPose
  • Recommended Settings: conditioning_scale=0.9, guidance_end=0.65
  • Use Case: Precise human pose and body structure control

5. Grayscale Control

  • Detector: cv2.cvtColor
  • Recommended Settings: conditioning_scale=0.9, guidance_end=0.8
  • Use Case: Grayscale-to-color generation with structural preservation
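
The recommended settings above can be collected into a small lookup table. The following is a minimal sketch, not an official client: the mode labels ("canny", "soft_edge", and so on) and the helper name recommended_settings are hypothetical, and it assumes that conditioning_scale and guidance_end in the lists above correspond to the request parameters controlnet_conditioning_scale and control_guidance_end.

```python
# Recommended settings per control mode, copied from the lists above.
# ASSUMPTION: the mode labels are hypothetical names for this sketch; the
# request parameter table for this endpoint lists no mode field.
# ASSUMPTION: conditioning_scale -> controlnet_conditioning_scale and
# guidance_end -> control_guidance_end.
RECOMMENDED = {
    "canny": {"controlnet_conditioning_scale": 0.7, "control_guidance_end": 0.8},
    "soft_edge": {"controlnet_conditioning_scale": 0.7, "control_guidance_end": 0.8},
    "depth": {"controlnet_conditioning_scale": 0.8, "control_guidance_end": 0.8},
    "pose": {"controlnet_conditioning_scale": 0.9, "control_guidance_end": 0.65},
    "gray": {"controlnet_conditioning_scale": 0.9, "control_guidance_end": 0.8},
}


def recommended_settings(mode):
    """Return a copy of the recommended settings for one control mode."""
    if mode not in RECOMMENDED:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(RECOMMENDED)}")
    # Copy so callers can tweak values without mutating the table.
    return dict(RECOMMENDED[mode])
```

The returned dictionary can be merged into a request body, for example {"prompt": "...", "control_image": "...", **recommended_settings("pose")}.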

Usage Guidelines

  • Detailed Prompts: Use detailed text prompts for better stability
  • Multi-Condition Support: Can be combined with other ControlNets
  • Parameter Tuning: Adjust controlnet_conditioning_scale and control_guidance_end for optimal results
  • Quality Input: Higher quality control images produce better results
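
To make the effect of control_guidance_end concrete, the sketch below counts the denoising steps on which ControlNet guidance would be active. It is modeled on the step-gating rule used in open-source ControlNet pipelines; whether this endpoint applies exactly the same rule is an assumption. The function name and the control_guidance_start argument are hypothetical and are not API parameters.

```python
def active_controlnet_steps(num_inference_steps, control_guidance_end,
                            control_guidance_start=0.0):
    """Return the 0-based step indices on which ControlNet guidance applies.

    ASSUMPTION: step i is active when its whole interval [i/n, (i+1)/n]
    lies inside [start, end]. This mirrors common open-source pipelines and
    is not confirmed for this endpoint.
    """
    n = num_inference_steps
    return [
        i for i in range(n)
        if i / n >= control_guidance_start and (i + 1) / n <= control_guidance_end
    ]


# With the endpoint defaults (28 steps, control_guidance_end = 0.8),
# guidance covers the first 22 steps under this rule.
default_steps = active_controlnet_steps(28, 0.8)
```

Under this rule, lowering control_guidance_end to 0.65 (the pose recommendation) leaves 18 of 28 steps guided, so the final steps refine details without the control signal.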

Performance Optimizations

Our WaveSpeedAI implementation includes:

  • Memory Optimization: Efficient GPU memory management for the streamlined architecture
  • Pipeline Acceleration: Optimized inference pipeline leveraging the simplified model structure
  • Dynamic Batching: Intelligent batching for improved throughput
  • Model Compilation: XeLerate-powered model compilation for faster inference

Professional Applications

  • Architectural Visualization: Depth and edge control for building renders
  • Character Design: Pose control for consistent character positioning
  • Art Direction: Soft edge control for concept development
  • Photography Enhancement: Grayscale colorization and structure preservation
  • Digital Art Creation: Combined control modes for artistic workflows

Limitations

  • Control Quality Dependency: Output quality depends on control image precision
  • Prompt Sensitivity: Results are influenced by both control inputs and text prompts
  • Removed Modes: No longer supports Tile mode (removed in Pro 2.0)
  • Memory Requirements: Still requires significant GPU memory for high-resolution outputs

Pro 2.0 is a unified ControlNet that covers five control modes in a single model, with a smaller memory footprint and better Canny and Pose control than the previous Pro version.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/flux-controlnet-union-pro2.0" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "A robot is giving a speech.",
    "control_image": "https://d1q70pf5vjeyhc.wavespeed.ai/media/images/1751204120011542164_h8qnjgc9.png",
    "size": "1024*1024",
    "num_inference_steps": 28,
    "guidance_scale": 3.5,
    "controlnet_conditioning_scale": 0.7,
    "control_guidance_end": 0.8,
    "seed": 0,
    "num_images": 1,
    "enable_safety_checker": true,
    "enable_base64_output": false
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
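
The same submit-then-query flow can be sketched in Python using only the standard library. This is an illustrative sketch, not an official client: the helper names (build_request, result_url, call, generate), the polling interval, and the timeout are assumptions, while the URLs, headers, and response fields (data.id, data.status, data.outputs, data.error) are the ones documented on this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/flux-controlnet-union-pro2.0"


def build_request(method, url, api_key, payload=None):
    """Build an authenticated request; the JSON body is attached only if given."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def result_url(task_id):
    """URL of the result endpoint for a submitted task."""
    return f"{API_BASE}/predictions/{task_id}/result"


def call(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)


def generate(api_key, payload, poll_interval=1.0, max_wait=300.0):
    """Submit a task, then poll until it is completed or failed.

    ASSUMPTION: poll_interval and max_wait are arbitrary illustration values.
    """
    submitted = call(build_request("POST", f"{API_BASE}/{MODEL_PATH}", api_key, payload))
    task_id = submitted["data"]["id"]
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        data = call(build_request("GET", result_url(task_id), api_key))["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task_id} did not finish within {max_wait} seconds")
```

Calling generate(api_key, payload) with the key from WAVESPEED_API_KEY and the JSON body shown in the curl example would return the list of output URLs once the task status becomes completed.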

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The prompt to generate an image from. |
| control_image | string | No | https://d1q70pf5vjeyhc.wavespeed.ai/media/images/1751204120011542164_h8qnjgc9.png | - | The URL of the control image for ControlNet guidance. |
| size | string | No | 1024*1024 | 256 ~ 1536 per dimension | The size of the generated image. |
| num_inference_steps | integer | No | 28 | 1 ~ 50 | The number of inference steps to perform. |
| guidance_scale | number | No | 3.5 | 0 ~ 20 | The CFG (Classifier Free Guidance) scale is a measure of how close you want the model to stick to your prompt when looking for a related image to show you. |
| controlnet_conditioning_scale | number | No | 0.7 | 0 ~ 2 | The conditioning scale for ControlNet. Higher values make the output follow the control image more closely. |
| control_guidance_end | number | No | 0.8 | 0 ~ 1 | The fraction of total steps at which ControlNet guidance ends. |
| seed | integer | No | - | -1 ~ 2147483647 | The same seed and the same prompt given to the same version of the model will output the same image every time. |
| num_images | integer | No | 1 | 1 ~ 4 | The number of images to generate. |
| enable_safety_checker | boolean | No | true | - | If set to true, the safety checker will be enabled. |
| enable_base64_output | boolean | No | false | - | Enable base64 encoded output. |
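
The documented ranges can be checked on the client before a task is submitted. The sketch below is a hypothetical helper, not part of the API: it assumes the ranges are inclusive at both ends and that size consists of two integers separated by an asterisk, as in the default value. How the server reports out-of-range values is not documented here.

```python
# Numeric ranges copied from the request parameter table.
# ASSUMPTION: both bounds are inclusive.
RANGES = {
    "num_inference_steps": (1, 50),
    "guidance_scale": (0, 20),
    "controlnet_conditioning_scale": (0, 2),
    "control_guidance_end": (0, 1),
    "seed": (-1, 2147483647),
    "num_images": (1, 4),
}


def validate(payload):
    """Return a list of problems found in a request body (empty if none)."""
    errors = []
    if not payload.get("prompt"):
        errors.append("prompt is required")
    for name, (low, high) in RANGES.items():
        if name in payload and not low <= payload[name] <= high:
            errors.append(f"{name} must be in [{low}, {high}]")
    if "size" in payload:
        size = payload["size"]
        parts = size.split("*") if isinstance(size, str) else []
        # Each dimension must be a plain integer between 256 and 1536.
        valid = len(parts) == 2 and all(
            p.isdigit() and 256 <= int(p) <= 1536 for p in parts
        )
        if not valid:
            errors.append("size must be two dimensions in [256, 1536] joined by '*'")
    return errors
```

An empty list means the body is consistent with the table; optional parameters that are omitted are not checked, since the service applies its defaults.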

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
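
A response with the fields above can be reduced to the values a caller usually needs. The following is a minimal sketch with a hypothetical helper name (summarize); it assumes that has_nsfw_contents is aligned index by index with outputs and treats any output without a flag as unflagged.

```python
def summarize(response):
    """Extract task ID, completion state, unflagged output URLs, and error text."""
    data = response.get("data") or {}
    status = data.get("status")
    outputs = data.get("outputs") or []
    flags = list(data.get("has_nsfw_contents") or [])
    # ASSUMPTION: flags[i] describes outputs[i]; missing flags count as False.
    flags += [False] * (len(outputs) - len(flags))
    return {
        "id": data.get("id"),
        # created and processing are the two non-terminal statuses.
        "done": status in ("completed", "failed"),
        "ok": response.get("code") == 200 and status == "completed",
        "safe_outputs": [url for url, nsfw in zip(outputs, flags) if not nsfw],
        "error": data.get("error") or None,
    }
```

The same helper works for the result query response, which carries the same data fields.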

Result Query Parameters

Result Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |

© 2025 WaveSpeedAI. All rights reserved.