Note to the GPU poor: you can still train
Reposted from Serge Belongie
LoQT is a new approach to training models in memory-constrained settings. It enables pre-training of a 13B-parameter LLM on a single 24 GB GPU without model parallelism, checkpointing, or offloading strategies during training.
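
For intuition, here is a minimal sketch of the general idea behind this family of methods: keep a quantized, frozen base weight and train only a small low-rank update, periodically merging it back into the re-quantized base so optimizer state stays tiny. The module name, the toy int8 quantizer, and the merge interval below are illustrative assumptions, not LoQT's actual implementation; see the repo for the real thing.

```python
# Toy illustration: quantized frozen base weight + trainable low-rank factors,
# periodically merged and re-quantized. Not LoQT's actual code.
import torch
import torch.nn as nn


def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization (stand-in for a real quantizer)."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale


def dequantize_int8(q: torch.Tensor, scale: torch.Tensor):
    return q.float() * scale


class LowRankQuantLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int = 16):
        super().__init__()
        w = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(w)
        q, scale = quantize_int8(w)
        # Frozen quantized base weight: stored as int8 buffers, no gradients.
        self.register_buffer("q_weight", q)
        self.register_buffer("scale", scale)
        # Trainable low-rank factors: only these get gradients and optimizer state.
        self.A = nn.Parameter(torch.zeros(out_features, rank))
        self.B = nn.Parameter(torch.randn(rank, in_features) * 0.01)

    def forward(self, x):
        w = dequantize_int8(self.q_weight, self.scale) + self.A @ self.B
        return x @ w.t()

    @torch.no_grad()
    def merge_and_reset(self):
        """Fold the low-rank update into the base weight and re-quantize it."""
        w = dequantize_int8(self.q_weight, self.scale) + self.A @ self.B
        q, scale = quantize_int8(w)
        self.q_weight.copy_(q)
        self.scale.copy_(scale)
        self.A.zero_()


# Usage: the optimizer only sees the low-rank factors, so its state is small.
layer = LowRankQuantLinear(1024, 1024, rank=16)
opt = torch.optim.AdamW(layer.parameters(), lr=1e-3)
for step in range(200):
    x = torch.randn(8, 1024)
    loss = layer(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if (step + 1) % 100 == 0:  # merge interval is an arbitrary illustrative choice
        layer.merge_and_reset()
```

The memory savings come from two places: the base weights live in low precision, and gradients plus Adam moments exist only for the rank-r factors rather than the full weight matrices.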

Code: github.com/sebulo/LoQT
