This paper masks out principal components instead of RGB patches because
(1) visible pixels may be redundant with masked ones,
(2) visible pixels may not be predictive of masked regions.
The paper reports +38% on classification tasks.
I wonder how much CroCo & *ST3R might benefit from this.
https://arxiv.org/abs/2502.06314
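A rough sketch of what masking principal components (rather than patches) could look like, in NumPy. This is only an illustration of the idea, not the paper's code: the function name `pca_mask`, the `mask_ratio` parameter, the random choice of components, and the input/target split are all assumptions made here.

```python
import numpy as np

def pca_mask(images, mask_ratio=0.5, seed=0):
    """Split images into a 'visible' and a 'masked' part in PCA space.

    Hypothetical helper: the selection rule and the targets are assumptions,
    not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = images.reshape(len(images), -1).astype(np.float64)  # (n, d) flattened
    mean = X.mean(axis=0)
    Xc = X - mean

    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = Vt.shape[0]

    # Pick a random subset of components to hide.
    masked = np.zeros(k, dtype=bool)
    n_masked = int(round(mask_ratio * k))
    masked[rng.choice(k, size=n_masked, replace=False)] = True

    coeffs = Xc @ Vt.T                          # per-image PCA coefficients
    visible = (coeffs * ~masked) @ Vt + mean    # model input: visible components only
    target = (coeffs * masked) @ Vt             # what the model would have to predict
    return visible, target, masked

images = np.random.default_rng(1).random((16, 8, 8))
visible, target, masked = pca_mask(images, mask_ratio=0.5)
```

Every output pixel of `visible` mixes information from the whole image, so unlike patch masking there is no spatial neighbour that trivially reveals the hidden part.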
@weinzaepfelp.bsky.social @vincentleroy.bsky.social