igilitschenski.bsky.social • 14 days ago

The global transformer shortage is a very serious issue. Please don't be wasteful! To do our part, the new @vectorinstitute.ai policy is to use at most 3 heads per multi-head attention layer.