Large Language Models
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
Efficient RLVR Training via Weighted Mutual Information Data Selection