Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 about 1 month ago • 121
Delta Belief RL Collection Collection of the models for our paper "Intrinsic Credit Assignment for Long Horizon Interaction". • 6 items • Updated Feb 13 • 1
Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision Paper • 2509.14234 • Published Sep 17, 2025 • 6
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs Paper • 2509.09677 • Published Sep 11, 2025 • 37 • 4