Just published by Hugging Face: The Ultra-Scale Playbook, a guide to how large model training is optimized on GPU clusters.
A lot of this is normally passed around as folklore and hard-won experience, so it's good to have everything explained in one place.
https://huggingface.co/spaces/nanotron/ultrascale-playbook