The next paper I saw was on continuous chain of thought, which creates new latent thoughts that are much more expressive than ordinary text tokens and lets the model compress its thinking by an order of magnitude (OOM).
https://arxiv.org/abs/2412.06769
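
To make the idea concrete, here is a toy sketch of how I understand the difference between ordinary and continuous chain of thought: instead of decoding each reasoning step to a token and re-embedding it, the hidden state itself is fed back in as the next input. Everything here (`step`, `decode`, `EMBED`, `W`, the sizes) is made up for illustration and is not the paper's code; a real model would be a trained transformer, not a random matrix.

```python
import math
import random

DIM = 4      # hidden/embedding width (toy value)
VOCAB = 6    # vocabulary size (toy value)

rng = random.Random(0)
# Hypothetical toy parameters standing in for a trained model.
EMBED = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
W = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]


def step(x):
    """One toy 'forward pass': map an input vector to a hidden state."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def decode(h):
    """Pick the token whose embedding has the largest dot product with h."""
    scores = [sum(e * hi for e, hi in zip(emb, h)) for emb in EMBED]
    return max(range(VOCAB), key=scores.__getitem__)


def discrete_cot(x, n_steps):
    """Ordinary chain of thought: every step is squeezed through a token."""
    tokens = []
    for _ in range(n_steps):
        h = step(x)
        tok = decode(h)
        tokens.append(tok)
        x = EMBED[tok]  # next input is one of only VOCAB possible vectors
    return tokens


def continuous_cot(x, n_steps):
    """Continuous chain of thought: the hidden state is the next input."""
    thoughts = []
    for _ in range(n_steps):
        h = step(x)
        thoughts.append(h)
        x = h  # no decoding, the full vector is carried forward
    return thoughts


start = EMBED[0]
tokens = discrete_cot(start, 3)      # list of token ids
thoughts = continuous_cot(start, 3)  # list of DIM-wide float vectors
```

In the discrete loop each step can only pass along one of `VOCAB` embeddings, while the continuous loop passes along an arbitrary `DIM`-wide vector, which is the sense in which a latent thought can carry more than a single token.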
