Obligatory "actually my lab invented test-time compute" post. In "Stay on topic with Classifier-Free Guidance," we show that CFG enables a model to expend twice as much compute at inference time and match the performance of a model twice as large.
https://arxiv.org/abs/2306.17806
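For anyone who hasn't seen CFG applied to language models, here is a rough sketch of where the doubled inference compute comes from. It assumes the usual formulation, in which each decoding step scores the next token twice (once with the prompt, once without) and extrapolates between the two in log-probability space. The function names and the toy logits are made up for illustration; this is not code from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cfg_log_probs(cond_logits, uncond_logits, gamma):
    """Combine prompt-conditioned and unconditioned next-token scores.

    Assumed formulation: uncond + gamma * (cond - uncond) in log-prob
    space, then renormalised. gamma = 1 recovers ordinary conditional
    sampling; gamma > 1 pushes harder toward what the prompt favours.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    mixed = [u + gamma * (c - u) for c, u in zip(cond, uncond)]
    return log_softmax(mixed)


# Toy vocabulary of three tokens (hypothetical numbers). In practice the
# two logit vectors come from two forward passes per decoding step, which
# is the extra inference-time compute.
cond_logits = [2.0, 0.0, 0.0]    # with the prompt in context
uncond_logits = [0.0, 0.0, 0.0]  # without the prompt

plain = cfg_log_probs(cond_logits, uncond_logits, gamma=1.0)
guided = cfg_log_probs(cond_logits, uncond_logits, gamma=1.5)
```

With these toy numbers, `guided` puts more probability on the first token than `plain` does, since that is the token the prompt favours.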