Towards a Science of Scaling Agent Systems

Yubin Kim, Ken Gu, Chanwoo Park, Chunjong Park, Samuel Schmidgall, A. Ali Heydari, Yao Yan, Zhihan Zhang, Yuchen Zhuang, Mark Malhotra, Paul Pu Liang, Hae Won Park, Yuzhe Yang, Xuhai Xu, Yilun Du, Shwetak Patel, Tim Althoff, Daniel McDuff, Xin Liu – 2025

Key findings

  • Tool-heavy tasks are especially challenging in multi-agent settings
  • Single-agent systems that already perform decently (>45%) outperform multi-agent systems
  • Multi-agent systems are best for parallelizable tasks and worst for sequential tasks
  • Multi-agent token budgets are 1.6-6.2x those of single-agent systems for the same performance
  • Coordination costs scale super-linearly with environmental complexity
  • There are increasing returns to scaling model intelligence
  • Agent systems cap out at 3-4 agents in practice due to context window limitations
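
The findings above can be read as rough rules of thumb for choosing an architecture. The sketch below is one hypothetical way to encode them, not the paper's method: the `Task` fields, the `recommend` function, and the order of the checks are illustrative assumptions. Only the 45% threshold, the 1.6-6.2x token multiplier, and the 3-4 agent cap come from the list. Treating tool-heavy tasks as a reason to stay single-agent is also an assumption; the notes only say such tasks are challenging.

```python
from dataclasses import dataclass

# Figures taken from the key findings; all names here are hypothetical.
SINGLE_AGENT_THRESHOLD = 0.45        # baseline score above which a single agent wins
TOKEN_MULTIPLIER_RANGE = (1.6, 6.2)  # multi-agent token budget relative to single-agent
MAX_PRACTICAL_AGENTS = 4             # upper end of the 3-4 agent cap


@dataclass
class Task:
    single_agent_score: float  # single-agent baseline performance, in [0, 1]
    parallelizable: bool       # can the work be split into independent subtasks?
    tool_heavy: bool           # does the task depend on many tool calls?


def recommend(task: Task) -> str:
    """Return "single" or "multi" using the rules of thumb listed above."""
    if task.single_agent_score > SINGLE_AGENT_THRESHOLD:
        return "single"  # a decent single agent already outperforms multi-agent
    if not task.parallelizable:
        return "single"  # sequential tasks are the worst case for multi-agent
    if task.tool_heavy:
        return "single"  # assumption: be conservative on tool-heavy tasks
    return "multi"


def multi_agent_token_range(single_agent_tokens: int) -> tuple[int, int]:
    """Estimate the multi-agent token budget needed to match single-agent performance."""
    low, high = TOKEN_MULTIPLIER_RANGE
    return round(single_agent_tokens * low), round(single_agent_tokens * high)
```

For example, `recommend(Task(0.30, parallelizable=True, tool_heavy=False))` selects the multi-agent branch, and `multi_agent_token_range(10_000)` gives the budget range implied by the 1.6-6.2x multiplier for a 10,000-token single-agent run.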
