Delighted to be a minor co-author on this work, led by Pranav Nair: Combining losses for different Matryoshka-nested groups of bits in each weight within a neural network leads to an accuracy improvement for models (esp. 2-bit reps). Paper: "Matryoshka Quantization" at arxiv.org/abs/2502.06786 - ThreadSky


jeffdean.bsky.social • 15 days ago

Delighted to be a minor co-author on this work, led by Pranav Nair: Combining losses for different Matryoshka-nested groups of bits in each weight within a neural network leads to an accuracy improvement for models (esp. 2-bit reps).

Paper: "Matryoshka Quantization" at https://arxiv.org/abs/2502.06786

Comments

debayanin.bsky.social•14 days ago

Amazing @pranavn1008.bsky.social !!

ichrvk.bsky.social•15 days ago

Curious how this affects inference latency compared to standard quantization schemes. The nested structure must add some overhead, no?

jeffdean.bsky.social•15 days ago

Nested structure is only there during training. You can extract the 2-bit quantized model after training and then it behaves just like any other 2-bit quantized model.
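As a rough illustration of the extraction step described above, here is a minimal sketch that assumes each weight is stored as an unsigned 8-bit code and that the lower-precision model is obtained by keeping only the most significant bits. The function name `slice_msbs` and the sample values are hypothetical, and plain truncation is an assumption; the paper's actual slicing and rounding may differ.

```python
def slice_msbs(code: int, total_bits: int = 8, keep_bits: int = 2) -> int:
    """Keep the top `keep_bits` bits of an unsigned `total_bits`-bit code.

    Assumption: simple truncation via right shift, not the paper's exact rule.
    """
    if not 0 < keep_bits <= total_bits:
        raise ValueError("keep_bits must be in 1..total_bits")
    if not 0 <= code < (1 << total_bits):
        raise ValueError("code does not fit in total_bits")
    return code >> (total_bits - keep_bits)


# Hypothetical 8-bit weight codes from a trained model.
int8_codes = [0, 37, 100, 128, 200, 255]

# Extract the nested 4-bit and 2-bit models once, after training.
int4_codes = [slice_msbs(c, 8, 4) for c in int8_codes]
int2_codes = [slice_msbs(c, 8, 2) for c in int8_codes]

# Nesting: slicing the 4-bit codes gives the same 2-bit codes as slicing
# the 8-bit codes directly.
int2_from_int4 = [slice_msbs(c, 4, 2) for c in int4_codes]

print(int4_codes)
print(int2_codes)
print(int2_from_int4 == int2_codes)
```

Once `int2_codes` is extracted, nothing about the nesting remains at inference time: the result is an ordinary list of 2-bit codes that can be packed and served like any other 2-bit quantized weights.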

jeffdean.bsky.social•15 days ago

Inspired by an off-hand comment I made to Prateek Jain & Aditya Kusupati about their Matryoshka Representation Learning work (https://arxiv.org/abs/2205.13147): "In the same way that Matryoshka representations across different units work, I wonder if we could treat bits of each weight in a similar nested way".

jeffdean.bsky.social•15 days ago

(Sorry, didn't have their BlueSky handles when I posted)

Other paper authors include co-first authors Pranav Nair (https://pranavn1008.bsky.social) and Puranjay Datta (https://puranjay1412.bsky.social), as well as Aditya Kusupati (https://adityakusupati.bsky.social) and Prateek Jain.

jeffdean.bsky.social•15 days ago

And welcome Prateek Jain ( @pjain9.bsky.social ) 🎉

