Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 25 days ago • 123
dealignai/Gemma-4-31B-JANG_4M-CRACK Image-Text-to-Text • 6B • Updated about 3 hours ago • 143k • 1.24k
mistralai/Voxtral-Mini-4B-Realtime-2602 Automatic Speech Recognition • 4B • Updated Mar 11 • 877k • 821