🏗️ Building on HF

Sergio Paniego PRO

sergiopaniego

huggingface

https://sergiopaniego.github.io/

Recent Activity

updated a dataset 22 minutes ago

agents-course/final-certificates

updated a dataset 22 minutes ago

agents-course/course-certificates-of-excellence

updated a dataset 1 day ago

huggingface-projects/Deep-RL-Course-Certification

Buckets 83

sergiopaniego/claude-code-static-0b7436-bucket

sergiopaniego/claude-code-local-bucket

sergiopaniego/claude-code-hf-sandbox-static-873887-bucket

sergiopaniego/claude-code-hf-sandbox-bucket

sergiopaniego/claude-code-hf-sandbox-static-62ec9c-bucket

sergiopaniego/pelican-svg-grpo-curves-bucket

Posts 106

Post

Simon Willison (@simonw) has been asking every new model to draw a pelican riding a bicycle for some time now

you look at the drawing and you know. but there is no number, so nothing can train against it, no?

I turned this idea into an RL env in OpenEnv. now you can eval any model against it and train against it with TRL

read the details!🤓

https://huggingface.co/blog/sergiopaniego/pelican-env-openenv

Articles 27

Article

Can you train a model on Simon Willison's deeply unscientific pelican benchmark?

Collections 10

Spaces 186

VLM Object Understanding

Explore object detection, visual grounding, keypoint Detecti

Qwen2-VL-7B

Ask questions about charts in images

SmolVLM-trl-dpo-rlaif-v

Generate text from an image and question

SmolVLM-trl-sft-ChartQA

Ask questions about charts in images

Claude Code Static 0b7436

View and explore your data with an interactive dashboard

Claude Code Local

Show tracking data visualization

Models 149

sergiopaniego/Qwen3-4B-claude-code-local-grpo

Text Generation • 4B • Updated 1 day ago

sergiopaniego/Qwen3-4B-claude-code-deepcoder-grpo

Text Generation • 4B • Updated 1 day ago

sergiopaniego/pelican-svg-grpo-Qwen3-1.7B-judged

Text Generation • 2B • Updated 4 days ago • 54

sergiopaniego/pelican-svg-grpo-Qwen3-1.7B

Text Generation • 2B • Updated 4 days ago • 37

sergiopaniego/Qwen3-8B-opencode-deepcoder-grpo

Text Generation • 8B • Updated 5 days ago • 36

sergiopaniego/grpo-youtube-livestream-3-scripts

Reinforcement Learning • Updated 6 days ago • 1

sergiopaniego/qwen3-0.6b-mbpp-grpo-k2

Text Generation • 0.6B • Updated 6 days ago • 69

sergiopaniego/qwen3-0.6b-mbpp-grpo-k16

Text Generation • 0.6B • Updated 6 days ago • 79

sergiopaniego/qwen3-0.6b-mbpp-grpo-k8

Text Generation • 0.6B • Updated 6 days ago • 74

sergiopaniego/Qwen3.5-4B-sdpo-math-hints

Updated 23 days ago • 1

Datasets 15

sergiopaniego/pelican-svg-drawings

Viewer • Updated 4 days ago • 169 • 91 • 1

sergiopaniego/math-sdpo-hints-plain

Viewer • Updated 23 days ago • 600 • 50

sergiopaniego/math-sdpo-hints

Viewer • Updated 23 days ago • 600 • 55

sergiopaniego/gsm8k-sdpo-plain

Viewer • Updated 23 days ago • 700 • 52

sergiopaniego/gsm8k-sdpo-hints

Viewer • Updated 23 days ago • 700 • 55

sergiopaniego/pi-mono-chat

Viewer • Updated about 1 month ago • 886 • 107

sergiopaniego/requests-pr-diff

Viewer • Updated May 19 • 1 • 23

sergiopaniego/trl-r2e-test

Viewer • Updated May 18 • 1 • 35

sergiopaniego/chain-sum-rollouts

Viewer • Updated May 4 • 50 • 10

sergiopaniego/ttt-scripted-smoke

Viewer • Updated Apr 17 • 20 • 20
