In “Abstracts,” VP Weizhu Chen discusses his team’s paper on how distinguishing between useful and “noisy” tokens during pretraining can improve token efficiency and model performance. The work was recognized as a best paper runner-up at NeurIPS 2024. https://www.microsoft.com/en-us/research/podcast/abstracts-neurips-2024-with-weizhu-chen/