We periodically merge the low-rank adapters into the quantized model at exponentially increasing intervals. After each merge, we reinitialize the adapters and continue training.
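A minimal sketch of that schedule, assuming an illustrative first interval and growth factor (the helper names and parameters here are ours, not the LoQT code):

```python
def merge_steps(total_steps: int, first_interval: int = 100, growth: float = 1.3) -> list[int]:
    """Steps at which the adapters are merged; the gaps grow exponentially."""
    steps, step, interval = [], 0, float(first_interval)
    while True:
        step += round(interval)
        if step >= total_steps:
            return steps
        steps.append(step)
        interval *= growth  # exponentially larger gap before the next merge


if __name__ == "__main__":
    schedule = set(merge_steps(10_000))  # e.g. merges at 100, 230, 399, ...
    for step in range(1, 10_001):
        # take an optimizer step on the low-rank adapters only (not shown)
        if step in schedule:
            # 1) fold the adapter product into the quantized weights
            # 2) re-quantize the merged weights
            # 3) reinitialize the adapters and keep training
            pass
```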
We show LoQT works for both LLM pre-training and downstream task adaptation📊.
3/4
