Cahlen Humphreys PRO
cahlen
AI & ML interests
☠️💻
Recent Activity
liked a model about 14 hours ago: openbmb/MiniCPM-V-4.6
liked a model 10 days ago: Zyphra/ZAYA1-8B
replied to unmodeled-tyler's post 10 days ago:
Just started a fun project!
https://huggingface.co/datasets/unmodeled-tyler/DoW-UFO-UAP-1
I'm getting the recently released DoW UFO/UAP documents (https://war.gov/ufo) cleaned and converted into a dataset here on Hugging Face!
There are 161 different files in the gov release (PDFs, images, videos, audio, etc.), and my current plan is to do it all in 1 dataset with 4 different shards. That way you can just call whichever tables you want/need when you import the dataset.
This is an ongoing project (I'm doing it on the side, alongside my regular projects), so it's a bit of a growing entity. I'll also continuously refine the data over time to make sure it's as clean as possible.
Check it out! Who knows what you'll find in there?
Organizations
Animations
World Models
3D / Mesh
Gaussians and Nerfs
- Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
  Paper • 2401.14257 • Published • 12
- LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
  Paper • 2402.05054 • Published • 29
- MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction
  Paper • 2402.12712 • Published • 18
- GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting
  Paper • 2402.10259 • Published • 15
Image Restoration
Surveys
TBR
Papers TO BE READ
- 3D-LLM: Injecting the 3D World into Large Language Models
  Paper • 2307.12981 • Published • 40
- Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study
  Paper • 2401.17981 • Published • 1
- SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM
  Paper • 2312.02126 • Published • 2
- Relightable Gaussian Codec Avatars
  Paper • 2312.03704 • Published • 32
Object Detection
Multimodal
DLM
Datasets
Audio
Web Agents
Data Generation
3D Avatar Utils
- Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
  Paper • 2401.15687 • Published • 24
- Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
  Paper • 2312.03029 • Published • 27
- DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
  Paper • 2312.13578 • Published • 29
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction
  Paper • 2312.13150 • Published • 15
Spatial
LLM
- Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
  Paper • 2402.05140 • Published • 23
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 21
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 61
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
  Paper • 2402.14658 • Published • 84
Video
- VideoPrism: A Foundational Visual Encoder for Video Understanding
  Paper • 2402.13217 • Published • 40
- No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
  Paper • 2405.08344 • Published • 15
- Helios: Real Real-Time Long Video Generation Model
  Paper • 2603.04379 • Published • 186
Agents
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 73
- Real-Time Reasoning Agents in Evolving Environments
  Paper • 2511.04898 • Published • 13
- AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
  Paper • 2508.16279 • Published • 63
- Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?
  Paper • 2604.03016 • Published • 37
AI OS