Eval datasets for CoT Oracle: authority bias, sycophancy, hint following, decorative reasoning, and more.
mats
non-profit
models 0
None public yet
datasets 15
cds-jb/synthweb-qwen3.5-9b-multiscale-inference
Viewer • Updated • 295k • 72
cds-jb/synthweb-qwen3-8b-multiscale-inference
Viewer • Updated • 296k • 84
cds-jb/AOBenchPlus
Viewer • Updated • 385 • 40
cds-jb/cot-qwen3-8b
Viewer • Updated • 202k • 38
cds-jb/cot-qwen35-9b
Viewer • Updated • 202k • 31
cds-jb/cot-oracle-corpus-v5-qwen35-9b
Viewer • Updated • 40.5k • 60
cds-jb/synthweb-qwen3.5-9b
Viewer • Updated • 982k • 28
cds-jb/synthweb-qwen3-8b
Viewer • Updated • 1.12M • 38
cds-jb/fineweb-oracle-convqa-chunked
Viewer • Updated • 29k • 108
cds-jb/cot-oracle-convqa-chunked-gemini
Viewer • Updated • 29.6k • 65