SigLino: Vision Foundation Models (SigLIP2 + DINOv3) Collection Vision encoders distilled from DINOv3 and SigLIP2 (MoE & Dense). CVPR 2026. • 6 items • Updated 3 days ago • 17
VINCIE Collection A diffusion transformer model for in-context image generation and editing • 6 items • Updated 25 days ago • 12
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19, 2025 • 233
view article Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face +3 Jul 29, 2025 • 221
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models Paper • 2507.13344 • Published Jul 17, 2025 • 59
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation Paper • 2507.02608 • Published Jul 3, 2025 • 22
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation Paper • 2507.02608 • Published Jul 3, 2025 • 22 • 3