Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar – 2024

  • Scaling test-time compute can be more effective than scaling model parameters (which increases both training and test-time compute)
  • Mechanisms explored
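    • Search against dense, process-based verifier reward models
    • Adaptively updating the model's distribution over responses at test time (e.g. sequential revisions)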
  • Effectiveness of a given test-time strategy depends on prompt difficulty
    • Implies test-time compute should be allocated adaptively per prompt ("compute-optimal" scaling)
  • On problems where a smaller base model already attains a non-trivial success rate, test-time compute can let it outperform a 14x larger model in a FLOPs-matched comparison
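
To make the "search + reward model" bullet concrete, here is a minimal best-of-N sketch: sample N candidate answers, score each with a reward model, and keep the top one. This is the simplest form of search against a verifier, not the paper's full method. The names `best_of_n`, `toy_sample`, and `toy_score` are hypothetical, and the toy functions are stand-ins for a real LLM sampler and a trained reward model.

```python
from typing import Callable


def best_of_n(
    prompt: str,
    sample: Callable[[str, int], str],
    score: Callable[[str, str], float],
    n: int,
) -> str:
    """Draw n candidate answers and return the one the scorer rates highest.

    `sample(prompt, i)` is assumed to return the i-th candidate answer and
    `score(prompt, answer)` a scalar reward; both are placeholders here.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample(prompt, i) for i in range(n)]
    # max() keeps the first candidate among ties.
    return max(candidates, key=lambda c: score(prompt, c))


# Toy stand-ins (assumptions for illustration only).
def toy_sample(prompt: str, i: int) -> str:
    # Deterministic "model": cycles through a fixed list of guesses.
    return ["5", "3", "4", "22"][i % 4]


def toy_score(prompt: str, answer: str) -> float:
    # "Reward model" that happens to recognise the correct answer.
    return 1.0 if answer == "4" else 0.0
```

With these stand-ins, `best_of_n("What is 2 + 2?", toy_sample, toy_score, n=1)` returns "5" and the same call with `n=4` returns "4". Here `n` is the test-time compute knob that a per-prompt strategy would set.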
