Terminus is a model trained for terminal agentic tasks, evaluated on benchmarks such as Terminal-Bench 2.0 and SWE-Bench, and intended to be efficient to use and run locally in environments such as Codex and OpenCode. It was trained on the dataset:
Terminus is designed to improve performance on terminal-based reasoning workflows, software engineering, and tool use. The table below reports its results alongside Qwen3 models for comparison.
| Model | Harness | Terminal-Bench 2.0 | SWE-Bench Verified |
|---|---|---|---|
| Qwen3-8B | Terminus-2 | 0.0 | 0.7 |
| Terminus-Qwen3-8b | Terminus-2 | 4.9 | 15.7 |
| Qwen3-32B | Terminus-2 | 1.9 | 5.7 |
| Qwen/Qwen3-Coder-30B-A3B-Instruct | OpenHands | 10.1 | 49.2 |
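
To make the comparison in the table easier to read, here is a minimal sketch that computes the point differences between two rows. The scores are copied verbatim from the table; the `scores` dictionary and the `gain` helper are illustrative names, not part of any released tooling. Note that the Qwen3-Coder row was run with a different harness (OpenHands), so differences against it are only indicative.

```python
# Scores copied from the table above, as
# (Terminal-Bench 2.0, SWE-Bench Verified).
scores = {
    "Qwen3-8B": (0.0, 0.7),
    "Terminus-Qwen3-8b": (4.9, 15.7),
    "Qwen3-32B": (1.9, 5.7),
    "Qwen/Qwen3-Coder-30B-A3B-Instruct": (10.1, 49.2),
}


def gain(model: str, baseline: str) -> tuple[float, ...]:
    """Return per-benchmark point differences (model minus baseline).

    Hypothetical helper for illustration; results are rounded to one
    decimal place to match the precision of the table.
    """
    return tuple(
        round(m - b, 1) for m, b in zip(scores[model], scores[baseline])
    )


if __name__ == "__main__":
    for baseline in ("Qwen3-8B", "Qwen3-32B"):
        tb, swe = gain("Terminus-Qwen3-8b", baseline)
        print(f"vs {baseline}: Terminal-Bench 2.0 {tb:+}, SWE-Bench Verified {swe:+}")
```

Under these numbers, Terminus-Qwen3-8b scores higher than both Qwen3-8B and the larger Qwen3-32B on both benchmarks when all three use the Terminus-2 harness.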
OpenAgent is an open-source effort focused on building stronger agentic models through better datasets, practical training, and real benchmark evaluation.