GAIA: a benchmark for General AI Assistants
Paper
• 2311.12983
• Published • 246
AppAgent: Multimodal Agents as Smartphone Users
Paper
• 2312.13771
• Published • 54
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Paper
• 2401.01614
• Published • 22
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Paper
• 2401.13919
• Published • 32
LARP: Language-Agent Role Play for Open-World Games
Paper
• 2312.17653
• Published • 33
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper
• 2402.01622
• Published • 38
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Paper
• 2402.00559
• Published • 3
TradingAgents: Multi-Agents LLM Financial Trading Framework
Paper
• 2412.20138
• Published • 33
RAG-Anything: All-in-One RAG Framework
Paper
• 2510.12323
• Published • 71
PaperBanana: Automating Academic Illustration for AI Scientists
Paper
• 2601.23265
• Published • 222