Some related work worth checking out if you find this interesting:

https://mbaradad.github.io/shaders21k/ - learning good visual features from procedurally generated images.
https://arxiv.org/abs/2403.14494 - distillation from randomly weighted teachers.
Reposted from Dmytro Mishkin
What Makes a Good Dataset for Knowledge Distillation?
Logan Frank, Jim Davis

tl;dr: you can distill models on almost anything but random noise.
arxiv.org/abs/2411.12817
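For context, the objective these distillation papers build on is the standard soft-target loss from Hinton et al. (2015): match the student's temperature-softened output distribution to the teacher's. A minimal sketch in NumPy (generic KD loss, not the specific training setup of any paper above):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax (numerically stabilized)."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    as in the original distillation paper."""
    p = softmax(teacher_logits, T)  # teacher soft targets
    q = softmax(student_logits, T)  # student predictions
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
```

The dataset question the paper asks is what inputs you feed through teacher and student to produce those logits; the loss itself stays the same.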