Building on HF

Prithiv Sakthi PRO

prithivMLmods

hugging-science

https://linktr.ee/prithivsakthi

AI & ML interests

computer vision, nlp, multimodality - HuggingFace Fellow ML 🤗

Recent Activity

reacted to lbourdois's post with 🔥 26 minutes ago

New blog post! An introduction to a little-known but highly effective model reduction method: 𝗧𝗿𝗶𝗺𝗺𝗶𝗻𝗴✂️

We show how to reduce model size (by up to 87.24%) while preserving performance. We applied this technique to 16 different model families across several modalities to illustrate that it works on any architecture (as long as the embedding layer is the last one of the model) and on any modality involving text. From these 16 families, we generated over 𝟱,𝟱𝟬𝟬 𝗺𝗼𝗻𝗼𝗹𝗶𝗻𝗴𝘂𝗮𝗹 𝗺𝗼𝗱𝗲𝗹𝘀 𝗶𝗻 𝟭𝟮𝟰 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲𝘀 🌍

Key takeaways from our experiments:

1️⃣ Trimming does not require a GPU. Our models were obtained on a CPU.

2️⃣ This method scales up to at least 4B parameters (we did not test beyond that).

3️⃣ A trimmed model is smaller than the original while preserving its performance. If you observe a slight performance drop, just fine-tune to recover or even surpass the original performance.

4️⃣ For an equivalent compute budget, it is better to trim and then fine-tune than to fine-tune the original model. Since the model is smaller, you can run more epochs or show it more data and end up with a better model than the original.

5️⃣ Trimming is a competitive alternative to distillation and quantization. For example, we obtained our alternative to DistilBERT in 9 minutes on CPU, versus 90 hours of GPU time for DistilBERT.

6️⃣ Trimming could make it possible to generate reasoning traces directly in the language of the trimmed model. This could be an alternative to generating traces in English and then translating them into the desired language.

Many other things (such as how much data is needed, the impact of the database used, the order in which it should be done, etc.) are covered in the blog post!

Blogpost: https://huggingface.co/blog/lbourdois/introduction-to-trimming
Models: https://huggingface.co/spaces/alphaedge-ai/Trimming_models_search
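
The post does not spell out the mechanics, so the following is only a minimal sketch under an assumption: that trimming means dropping the vocabulary entries (and their embedding rows) that the target language never uses. The function names, the toy vocabulary, and the tiny embedding table are all hypothetical and are not taken from the blog post or from any library.

```python
# Toy illustration of vocabulary trimming (an assumed reading of the post,
# not the authors' implementation). Pure Python, no GPU or libraries needed.

def trim_embeddings(embeddings, vocab, corpus_tokens, special_tokens=()):
    """Keep only the embedding rows for tokens seen in the target-language corpus.

    embeddings: list of rows, one per token id
    vocab: dict mapping token -> original token id
    corpus_tokens: iterable of tokens observed in the target language
    special_tokens: tokens that must always be kept (e.g. padding)
    """
    keep = set(special_tokens) | set(corpus_tokens)
    # Walk the vocabulary in original id order so the new ids are deterministic.
    kept_tokens = [tok for tok in sorted(vocab, key=vocab.get) if tok in keep]
    new_vocab = {tok: new_id for new_id, tok in enumerate(kept_tokens)}
    new_embeddings = [embeddings[vocab[tok]] for tok in kept_tokens]
    return new_embeddings, new_vocab


def reduction(original_rows, trimmed_rows):
    """Fraction of embedding rows removed by trimming."""
    return 1.0 - trimmed_rows / original_rows


# Hypothetical multilingual vocabulary with a 2-dimensional embedding per token.
vocab = {"<pad>": 0, "hello": 1, "bonjour": 2, "hola": 3, "monde": 4, "world": 5}
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]

# Tokens observed in a (toy) French corpus.
french_corpus = ["bonjour", "monde", "bonjour"]

trimmed, trimmed_vocab = trim_embeddings(
    embeddings, vocab, french_corpus, special_tokens=["<pad>"]
)
print(trimmed_vocab)
print(f"rows removed: {reduction(len(embeddings), len(trimmed)):.0%}")
```

In this toy case only the embedding table shrinks; how much of a real model's total size that represents depends on how large its vocabulary is relative to the rest of its parameters.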

reacted to lbourdois's post with 🤗 26 minutes ago

updated a collection 32 minutes ago

MTP Qwen 3.5/3.6 Stable

New activity in prithivMLmods/gemma-4-26B-A4B-it-Uncensored-MAX about 7 hours ago

Loops/stuck

#1 opened about 1 month ago by

New activity in prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast about 11 hours ago

add app [ref --2e8856a]

#15 opened about 11 hours ago by

New activity in prithivMLmods/PiD-Image-Upscaler 3 days ago

Download button for png output

#1 opened 3 days ago by

New activity in strangerzonehf/Qwen-Image-Edit-LoRA-Collection 3 days ago

add QIE-FRIE1.1-Object-Mover

#8 opened 3 days ago by

New activity in prithivMLmods/QIE-FRIE1.1-Object-Mover 3 days ago

add checkpoints ✅

#1 opened 3 days ago by

New activity in strangerzonehf/Qwen-Image-Edit-LoRA-Collection 4 days ago

add QIE-FRIE1.1-Deep-Grainy-Polaroid-4000

#7 opened 4 days ago by

New activity in prithivMLmods/QIE-FRIE1.1-Deep-Grainy-Polaroid 4 days ago

add LoRA

#2 opened 4 days ago by

add checkpoints ✅

#1 opened 4 days ago by

New activity in strangerzonehf/Qwen-Image-Edit-LoRA-Collection 4 days ago

add QIE-FRIE1.1-Neutral-Exposure-3000

#6 opened 4 days ago by

add QIE-FRIE1.1-BBox-Object-Remover [version 4]

#5 opened 4 days ago by

New activity in prithivMLmods/QIE-FRIE1.1-Neutral-Exposure 4 days ago

add checkpoints ✅

#1 opened 4 days ago by

add LoRA

#2 opened 4 days ago by

New activity in bartowski/Qwen_Qwen3.6-35B-A3B-GGUF 4 days ago

Single GGUF file that contains both the main model (trunk) and the MTP heads !!

#6 opened 6 days ago by

New activity in Brian6145/Qwen3.6-27B-Claude-Opus-Sonnet-DistilledV2-MTP-GGUF 4 days ago

Main model (trunk) and the MTP heads in a single GGUF file??

#2 opened 5 days ago by

New activity in prithivMLmods/QIE-FRIE1.1-BBox-Object-Remover-v4 4 days ago

add LoRA

#2 opened 4 days ago by

update checkpoints ✅

#1 opened 4 days ago by

New activity in prithivMLmods/QIE-2509-Object-Mover-Bbox 5 days ago

Help needed for selection boxes

#1 opened about 2 months ago by

New activity in prithivMLmods/Qwen-Image-Edit-2511-Polaroid-Photo 5 days ago

Size of output

#1 opened 2 months ago by

New activity in prithivMLmods/Qwen-Image-Edit-2511-Ultra-Realistic-Portrait 5 days ago

Size of output

#1 opened 2 months ago by

New activity in prithivMLmods/QIE-2511-Zoom-Master 5 days ago

Size output

#2 opened 2 months ago by