Oliver Pfaffel

OliP

o1iv3r

AI & ML interests

None yet

Organizations

OliP's collections 13

NewGen small LMs

HuggingFaceTB/SmolLM2-135M

Text Generation • 0.1B • Updated Feb 6, 2025 • 1.41M • 196
HuggingFaceTB/SmolLM2-360M-Instruct

Text Generation • 0.4B • Updated Sep 22, 2025 • 287k • 188
facebook/MobileLLM-125M

Updated May 5, 2025 • 797 • 133

2024 Papers of the year

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 119
SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 123
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 79

Llm Pricing 📊

Paused • 272 • Display a React app with TypeScript
Can You Run It? LLM version 🚀

Running • Featured • 1.05k • Check if your GPU can run a chosen LLM model
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

Paper • 2312.15234 • Published Dec 23, 2023 • 3
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Paper • 2407.11062 • Published Jul 10, 2024 • 10

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Paper • 2407.14057 • Published Jul 19, 2024 • 46
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Paper • 2407.14482 • Published Jul 19, 2024 • 26
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

Paper • 2407.11963 • Published Jul 16, 2024 • 44

Special LMs <10B

Salesforce/xLAM-1b-fc-r

Text Generation • 1B • Updated Apr 11, 2025 • 2.6k • 59
AI-MO/NuminaMath-7B-TIR

Text Generation • 7B • Updated Aug 14, 2024 • 282 • 351
google/shieldgemma-9b

Text Generation • 9B • Updated Sep 6, 2024 • 4.82k • 28
meta-llama/Llama-Guard-3-8B

Text Generation • 8B • Updated Oct 11, 2024 • 65.8k • 298

Self-Taught Evaluators

Paper • 2408.02666 • Published Aug 5, 2024 • 28
Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries

Paper • 2409.12640 • Published Sep 19, 2024 • 4
openai/MMMLU

Viewer • Updated Oct 16, 2024 • 393k • 11.7k • 522
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 41

SciCode: A Research Coding Benchmark Curated by Scientists

Paper • 2407.13168 • Published Jul 18, 2024 • 17
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23, 2024 • 78
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

Paper • 2408.03910 • Published Aug 7, 2024 • 18
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

Paper • 2408.07060 • Published Aug 13, 2024 • 41

Leading Leaderboards

Open LLM Leaderboard 🏆

Running on CPU Upgrade • 14k • Track, rank and evaluate open LLMs and chatbots
MTEB Leaderboard 🥇

Running on CPU Upgrade • 7.43k • Embedding Leaderboard
Arena Leaderboard 🏆

Running • 4.9k • View the LMArena leaderboard in full-screen
BigCodeBench Leaderboard 🥇

Running • Agents • 230 • Explore code-generation model leaderboards and task details

2023 (and before) Papers of the Year

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

Paper • 2306.00989 • Published Jun 1, 2023 • 1
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 66
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Matryoshka Representation Learning

Paper • 2205.13147 • Published May 26, 2022 • 27

Vision-Language

EVLM: An Efficient Vision-Language Model for Visual Understanding

Paper • 2407.14177 • Published Jul 19, 2024 • 45
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

Paper • 2407.04172 • Published Jul 4, 2024 • 25
facebook/chameleon-7b

Image-Text-to-Text • 7B • Updated Jul 23, 2024 • 238k • 201
vidore/colpali

Visual Document Retrieval • Updated Nov 24, 2025 • 6.25k • 479

Stable Audio Open

Paper • 2407.14358 • Published Jul 19, 2024 • 27
Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15, 2024 • 64
kyutai/moshiko-pytorch-bf16

Updated Sep 18, 2024 • 181k • 242
Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published Oct 7, 2024 • 18

Diffree 🖼

Build error • Agents • Featured • 137
AnimateDiff 🐠

Runtime error • Agents • Featured • 503

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27, 2024 • 58
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Paper • 2408.02900 • Published Aug 6, 2024 • 31
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 128
