Scaling Mixture of Experts: Architecture Search for Billion-Parameter Language Models • kshitijthakkar • Feb 9
Systematic Architecture Search for Mobile-Optimized Mixture of Experts Language Models • kshitijthakkar • Feb 6