From Frames to Clips: Efficient Key Clip Selection for Long-Form Video Understanding Paper • 2510.02262 • Published Oct 2, 2025 • 3
Article How to generate text: using different decoding methods for language generation with Transformers patrickvonplaten • Mar 1, 2020 • 299
Article Illustrating Reinforcement Learning from Human Feedback (RLHF) natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 416
singhalarchit/vit-base-oxford-iiit-pets Image Classification • 85.8M • Updated Mar 18, 2025 • 7
google/vit-base-patch16-224 Image Classification • 86.6M • Updated Sep 5, 2023 • 4.9M • 967
Qwen/Qwen2.5-VL-7B-Instruct Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 6.12M • 1.56k
nomic-ai/modernbert-embed-base Sentence Similarity • 0.1B • Updated Jan 24, 2025 • 157k • • 231
apple/aimv2-large-patch14-224 Image Feature Extraction • 0.3B • Updated Jul 8, 2025 • 1.36k • 62