mlx-community/rfdetr-seg-large-fp32

This model was converted to MLX format from RF-DETR (ICLR 2026) using mlx-vlm version 0.4.3.

RF-DETR Seg-Large produces higher-quality instance masks than Seg-Small thanks to a higher input resolution (504px vs. 384px), more decoder layers (5 vs. 4), and more segmentation blocks (5 vs. 4).

Use with mlx

pip install -U mlx-vlm

from pathlib import Path
from PIL import Image
from mlx_vlm.utils import load_model
from mlx_vlm.models.rfdetr.processing_rfdetr import RFDETRProcessor
from mlx_vlm.models.rfdetr.generate import RFDETRPredictor

model = load_model(Path("mlx-community/rfdetr-seg-large-fp32"))
processor = RFDETRProcessor.from_pretrained("mlx-community/rfdetr-seg-large-fp32")
predictor = RFDETRPredictor(model, processor, score_threshold=0.3, nms_threshold=0.5)

result = predictor.predict(Image.open("image.jpg"))
# result.boxes   - (N, 4) xyxy pixel coordinates
# result.scores  - (N,) confidence scores
# result.masks   - (N, H, W) binary instance masks
# result.class_names - list of class names
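
The fields listed above can be post-processed with ordinary array code. The following is a minimal sketch, not part of mlx-vlm: it uses stand-in NumPy arrays with the documented shapes instead of a real prediction, it assumes `class_names` holds one name per detection, and `summarize` is a hypothetical helper introduced only for illustration.

```python
import numpy as np

# Stand-in data shaped like the documented result fields. With a real
# prediction, these would come from the result object (assumed here to be
# convertible with np.asarray).
boxes = np.array([[10, 20, 110, 220], [50, 60, 90, 100]], dtype=np.float32)  # (N, 4) xyxy
scores = np.array([0.92, 0.41], dtype=np.float32)                            # (N,)
masks = np.zeros((2, 240, 320), dtype=bool)                                  # (N, H, W)
masks[0, 20:220, 10:110] = True
masks[1, 60:100, 50:90] = True
class_names = ["person", "dog"]  # assumption: one name per detection


def summarize(boxes, scores, masks, class_names, min_score=0.5):
    """Hypothetical helper: keep confident detections and report their areas."""
    rows = []
    for i in np.flatnonzero(scores >= min_score):
        x1, y1, x2, y2 = boxes[i].tolist()
        rows.append(
            {
                "label": class_names[i],
                "score": float(scores[i]),
                "box_area": (x2 - x1) * (y2 - y1),  # area of the xyxy box in pixels
                "mask_area": int(masks[i].sum()),   # number of foreground pixels
            }
        )
    return rows


rows = summarize(boxes, scores, masks, class_names)
for row in rows:
    print(row)
```

Filtering a second time with `min_score` is only useful when it is stricter than the `score_threshold` passed to the predictor.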

CLI

# Image segmentation
python -m mlx_vlm.models.rfdetr.generate --task segment --image photo.jpg --model mlx-community/rfdetr-seg-large-fp32

# Video processing
python -m mlx_vlm.models.rfdetr.generate --task track --video input.mp4 --model mlx-community/rfdetr-seg-large-fp32

# Realtime camera
python -m mlx_vlm.models.rfdetr.generate --task realtime --model mlx-community/rfdetr-seg-large-fp32

# Blur background
python -m mlx_vlm.models.rfdetr.generate --task segment --image photo.jpg --model mlx-community/rfdetr-seg-large-fp32 --annotator blur+bg

Model Details

Architecture: DINOv2-small backbone + C2f projector + 5-layer Deformable DETR decoder + 5-block Segmentation head
Task: Object detection + instance segmentation (COCO 80 classes)
Parameters: ~36M
Input resolution: 504x504
Mask resolution: 126x126
Dtype: float32
Inference (M4 Max): 89ms per image (11 FPS)
Peak memory: ~1.2 GB

Comparison

| Model | Resolution | Mask quality | Inference | Memory |
| --- | --- | --- | --- | --- |
| rfdetr-seg-small-fp32 | 384px (96x96 masks) | Good | ~26ms (38 FPS) | 683 MB |
| rfdetr-seg-large-fp32 | 504px (126x126 masks) | Better | ~89ms (11 FPS) | 1.2 GB |
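
The figures in the comparison are related by simple arithmetic, which the short sketch below reproduces. The factor of 4 between input and mask resolution is inferred from the listed numbers (504/126 and 384/96), not taken from the model documentation, and both function names are hypothetical.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def mask_side(input_side: int, stride: int = 4) -> int:
    """Mask side length, assuming masks are predicted at 1/stride of the input."""
    return input_side // stride


# (model name, input side in pixels, latency in ms) as listed in the comparison
for name, side, latency_ms in [
    ("rfdetr-seg-small-fp32", 384, 26.0),
    ("rfdetr-seg-large-fp32", 504, 89.0),
]:
    m = mask_side(side)
    print(f"{name}: {m}x{m} masks, {fps_from_latency_ms(latency_ms):.1f} FPS")
```

By this calculation the large model is roughly 3.4 times slower per image than the small one (89 ms vs. 26 ms).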
