Natural language guidance of high-fidelity text-to-speech with synthetic annotations Paper • 2402.01912 • Published Feb 2, 2024 • 14
TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents Paper • 1901.08149 • Published Jan 23, 2019 • 4
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 38
HuggingFace's Transformers: State-of-the-art Natural Language Processing Paper • 1910.03771 • Published Oct 9, 2019 • 23
Datasets: A Community Library for Natural Language Processing Paper • 2109.02846 • Published Sep 7, 2021 • 15
DistilBERT release Collection Original DistilBERT model and checkpoints, obtained via teacher-student learning from the original BERT checkpoints. • 6 items • Updated Apr 17, 2024 • 44
Article SmolLM - blazingly fast and remarkably powerful loubnabnl, anton-l, eliebak • Jul 16, 2024 • 457
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 259
Pythia Scaling Suite Collection Pythia is the first LLM suite designed specifically to enable scientific research on LLMs. To learn more see https://github.com/EleutherAI/pythia • 18 items • Updated Feb 26, 2025 • 33
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M params to 1B. See the paper at https://arxiv.org/abs/250 • 30 items • Updated Mar 2 • 29
Article SmolLM3: smol, multilingual, long-context reasoner eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 777