ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts Paper • 2505.10010 • Published May 15, 2025
RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution Paper • 2603.20799 • Published Mar 21
Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation Paper • 2302.09368 • Published Feb 18, 2023
EDCO: Dynamic Curriculum Orchestration for Domain-specific Large Language Model Fine-tuning Paper • 2601.03725 • Published Jan 7
Reinforcement Learning with Promising Tokens for Large Language Models Paper • 2602.03195 • Published Feb 3
Language Model Self-improvement by Reinforcement Learning Contemplation Paper • 2305.14483 • Published May 23, 2023