VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions Paper • 2605.27141 • Published 3 days ago • 11
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling Paper • 2310.04691 • Published Oct 7, 2023 • 3
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 4 days ago • 96
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published 3 days ago • 112
WBench Collection WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation • 3 items • Updated 1 day ago • 1