LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding Paper • 2602.20913 • Published Feb 24 • 11
PhotoDoodle Image Edit GPU: Edit photos with AI using custom prompts and styles Space • 107
VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning Paper • 2512.06373 • Published Dec 6, 2025 • 9
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search Paper • 2509.07969 • Published Sep 9, 2025 • 60
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding Paper • 2411.14347 • Published Nov 21, 2024 • 17
Florence 2: Generate captions, detections, and segmentations for any image Space • 844
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection Paper • 2405.10300 • Published May 16, 2024 • 31 • 2