Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism Paper • 2606.00408 • Published May 29 • 65
MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI Paper • 2605.08678 • Published May 9 • 9
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning Paper • 2509.03646 • Published Sep 3, 2025 • 33
Reverse-Engineered Reasoning for Open-Ended Generation Paper • 2509.06160 • Published Sep 7, 2025 • 151