Woojung Song

Opusdei

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 6 hours ago

Agents' Last Exam

Paper • 2606.05405 • Published 8 days ago • 163

authored a paper 1 day ago

Human Psychometric Questionnaires Mischaracterize LLM Behavior

Paper • 2509.10078 • Published 13 days ago • 31

liked a model 1 day ago

Value4AI/ValueLlama-3-8B

Text Generation • 8B • Updated Sep 19, 2024 • 101 • 6

upvoted a paper 1 day ago

SWE-Explore: Benchmarking How Coding Agents Explore Repositories

Paper • 2606.07297 • Published 6 days ago • 105

upvoted a paper 2 days ago

Human Psychometric Questionnaires Mischaracterize LLM Behavior

Paper • 2509.10078 • Published 13 days ago • 31

submitted a paper to Daily Papers 2 days ago

Human Psychometric Questionnaires Mischaracterize LLM Behavior

Paper • 2509.10078 • Published 13 days ago • 31

upvoted a paper 2 days ago

SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations

Paper • 2606.05563 • Published 7 days ago • 47

upvoted a paper 4 days ago

Cosmos 3: Omnimodal World Models for Physical AI

Paper • 2606.02800 • Published 10 days ago • 111

authored a paper 5 days ago

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Paper • 2606.05553 • Published 7 days ago • 47

upvoted 2 papers 6 days ago

RobotValues: Evaluating Household Robots When Human Values Conflict

Paper • 2606.03312 • Published 9 days ago • 25

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Paper • 2606.05553 • Published 7 days ago • 47

liked a dataset 22 days ago

Value4AI/Agent-ValueBench

Viewer • Updated 28 days ago • 9.06k • 3.3k • 3

upvoted a paper 24 days ago

Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

Paper • 2605.14368 • Published 28 days ago • 16

upvoted a paper 26 days ago

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

Paper • 2605.10365 • Published about 1 month ago • 9

upvoted 2 papers 27 days ago

KL for a KL: On-Policy Distillation with Control Variate Baseline

Paper • 2605.07865 • Published May 8 • 22

Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

Paper • 2605.07579 • Published May 8 • 18

updated a collection almost 2 years ago

forfun

Collection

1 item • Updated Aug 7, 2024

liked a model almost 2 years ago

LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct

Text Generation • 8B • Updated Aug 8, 2024 • 41.7k • 420
