Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar – 2024
- Scaling test-time compute can be more effective than scaling model parameters (which increases both training and test-time compute)
- Mechanisms explored
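  - Search against a reward model (verifier)
  - Adaptively updating the model's response distribution at test time (e.g., by revising its own answer)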
- Effectiveness of test-time compute depends on task difficulty
  - Implies test-time compute should be allocated adaptively per prompt (a "compute-optimal" strategy)
- In a FLOPs-matched comparison, on problems where the small base model already has a non-trivial success rate, a small model augmented with test-time compute can outperform a 14x larger one
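
To make the "search + reward model" mechanism and the task-dependent budget idea concrete, here is a minimal best-of-N sketch. It is not the paper's implementation. `toy_sample`, `toy_score`, `TARGET`, and `budget_for` are hypothetical stand-ins: a real system would call an LLM and a learned verifier, and the linear difficulty-to-budget rule is only an assumption made for illustration.

```python
import random
from typing import Callable

# Hypothetical toy stand-ins; a real system would use an LLM and a learned verifier.
TARGET = 42


def toy_sample(prompt: str, rng: random.Random) -> str:
    """Pretend base model: proposes a random integer answer as text."""
    return str(rng.randint(0, 100))


def toy_score(prompt: str, candidate: str) -> float:
    """Pretend reward model: higher is better, maximal at the target answer."""
    return -abs(int(candidate) - TARGET)


def best_of_n(
    prompt: str,
    sample: Callable[[str, random.Random], str],
    score: Callable[[str, str], float],
    n: int,
    rng: random.Random,
) -> str:
    """Draw n candidate answers and return the one the verifier scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))


def budget_for(difficulty: float, min_n: int = 1, max_n: int = 64) -> int:
    """Assumed rule (not from the paper): map difficulty in [0, 1] linearly to a sample count."""
    difficulty = min(max(difficulty, 0.0), 1.0)  # clamp out-of-range estimates
    return min_n + round(difficulty * (max_n - min_n))


if __name__ == "__main__":
    for difficulty in (0.0, 0.5, 1.0):
        n = budget_for(difficulty)
        answer = best_of_n("What is 6 * 7?", toy_sample, toy_score, n, random.Random(0))
        print(f"difficulty={difficulty} samples={n} answer={answer}")
```

With a fixed seed, raising `n` can only keep or improve the verifier score of the selected answer, since the larger candidate set contains the smaller one. How much that helps in practice depends on the quality of the sampler and the verifier.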