DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards • Paper • 2605.21467 • Published 16 days ago • 204
Improving Vision-language Models with Perception-centric Process Reward Models • Paper • 2604.24583 • Published Apr 27 • 3