Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar – 2024

Scaling test-time compute can be more effective than scaling model parameters (scaling parameters increases both training and inference compute)
Mechanisms explored
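- Search against a process-based verifier (reward model)
- Adaptively updating the model's response distribution at test time, e.g. by sequential revisions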
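The search mechanism can be illustrated in its simplest form, best-of-N sampling: draw N candidate answers and keep the one a reward model scores highest. This is a minimal sketch, not the paper's implementation. `best_of_n`, `make_toy_generator` and `toy_score` are hypothetical names, and the toy generator and scorer only stand in for a real LLM and a learned verifier.

```python
from typing import Callable, Iterable


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int) -> str:
    """Sample n candidates and return the one the scorer rates highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=lambda c: score(prompt, c))


# Toy stand-ins (assumptions for illustration only).
def make_toy_generator(guesses: Iterable[str]) -> Callable[[str], str]:
    """Return a 'model' that replays a fixed sequence of guesses."""
    it = iter(guesses)
    return lambda prompt: next(it)


def toy_score(prompt: str, answer: str) -> float:
    """Pretend verifier: prefers integer answers close to 42."""
    return -abs(int(answer) - 42)
```

Raising `n` is the knob that spends more test-time compute here. Stronger search methods score intermediate steps rather than only finished answers, but the structure (propose, score, select) is the same.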
Effectiveness of test-time compute depends on task difficulty
- Implies the test-time compute budget and strategy should be chosen per prompt, based on estimated difficulty
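One way to make compute task dependent is to bin prompts by estimated difficulty and pick, for each bin, whichever strategy did best on held-out data. The sketch below assumes difficulty is estimated from a pass rate in [0, 1] and uses equal-width bins. Both choices, and the names `difficulty_bin` and `pick_strategy_per_bin`, are illustrative assumptions rather than the paper's exact procedure.

```python
from typing import Dict, List


def difficulty_bin(pass_rate: float, n_bins: int = 5) -> int:
    """Map an estimated pass rate to a bin: 0 is easiest, n_bins - 1 hardest."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must lie in [0, 1]")
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    # A pass rate of 0.0 would index one past the end, so clamp it.
    return min(int((1.0 - pass_rate) * n_bins), n_bins - 1)


def pick_strategy_per_bin(val_accuracy: Dict[str, List[float]]) -> List[str]:
    """Given val_accuracy[strategy][bin], return the best strategy for each bin."""
    if not val_accuracy:
        raise ValueError("need at least one strategy")
    n_bins = len(next(iter(val_accuracy.values())))
    if any(len(acc) != n_bins for acc in val_accuracy.values()):
        raise ValueError("every strategy needs one accuracy per bin")
    return [max(val_accuracy, key=lambda s: val_accuracy[s][b])
            for b in range(n_bins)]
```

At inference time a prompt would be assigned a bin with `difficulty_bin` and then handled by the strategy stored for that bin.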
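On problems where a smaller base model already achieves a non-trivial success rate, adding test-time compute can let it outperform a much larger model (the paper reports a 14x larger one in a FLOPs-matched evaluation)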
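Comparing a small model plus test-time compute against a larger model is only fair at equal total compute. The sketch below works out how much extra inference the small model can afford. It assumes the common approximations of 6·N·D FLOPs for pretraining and 2·N·D FLOPs for inference (N parameters, D tokens), and that both models are pretrained on the same number of tokens. The function names are hypothetical.

```python
def total_flops(n_params: float, pretrain_tokens: float,
                inference_tokens: float) -> float:
    """Approximate lifetime FLOPs: 6*N*D for pretraining plus 2*N*D for inference."""
    return 6.0 * n_params * pretrain_tokens + 2.0 * n_params * inference_tokens


def matched_inference_multiplier(scale: float, pretrain_tokens: float,
                                 inference_tokens: float) -> float:
    """Factor k by which the small model may multiply its inference tokens
    to match the total FLOPs of a model `scale` times larger.

    Solving 6*N*Dp + 2*N*k*Di = 6*M*N*Dp + 2*M*N*Di for k (M = scale) gives
    k = M + 3 * (M - 1) * Dp / Di.
    """
    if scale < 1 or pretrain_tokens <= 0 or inference_tokens <= 0:
        raise ValueError("scale must be >= 1 and token counts positive")
    return scale + 3.0 * (scale - 1.0) * pretrain_tokens / inference_tokens
```

The multiplier grows with the ratio of pretraining tokens to inference tokens, so the small model's budget is largest when a model is expected to serve relatively few tokens over its lifetime.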
