MBZUAI/CoME-VL
Image-Text-to-Text
Natural Language Processing, Machine Learning, and Computer Vision
LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation
CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare