Models That Know How Evaluations Are Designed Score Safer Paper • 2605.28591 • Published 14 days ago • 8
SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks Paper • 2605.31433 • Published 12 days ago • 28
DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes Paper • 2605.28421 • Published 14 days ago • 46
LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence Paper • 2605.25979 • Published 16 days ago • 27
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 22 days ago • 186
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 20 days ago • 169