chrisoffner3d.bsky.social • 15 days ago

This paper masks out principal components instead of RGB patches because
(1) visible pixels may be redundant with masked ones,
(2) visible pixels may not be predictive of masked regions.

+38% on classification tasks.

I wonder how much CroCo & *ST3R might benefit from this.
https://arxiv.org/abs/2502.06314
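
To make the idea concrete, here is a minimal NumPy sketch of masking in principal-component space rather than pixel space. It is not the paper's implementation: the function names (`pca_basis`, `mask_components`), the uniform random choice of components, and the use of plain SVD on flattened images are all assumptions made for illustration.

```python
import numpy as np


def pca_basis(images):
    """Compute a PCA basis from flattened images of shape (n, d).

    Returns the mean image (d,) and the principal directions (k, d),
    where k = min(n, d) and rows are orthonormal, sorted by variance.
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt


def mask_components(image, mean, components, mask_ratio, rng):
    """Split one image into a visible part and a masked target part.

    Assumption: components are masked uniformly at random; the paper
    may select them differently.
    """
    k = components.shape[0]
    coeffs = components @ (image - mean)  # projection coefficients, shape (k,)

    # Pick exactly round(mask_ratio * k) components to hide.
    n_masked = int(round(mask_ratio * k))
    masked = np.zeros(k, dtype=bool)
    masked[rng.permutation(k)[:n_masked]] = True

    # Model input: the image rebuilt from the visible components only.
    visible = mean + components[~masked].T @ coeffs[~masked]
    # Prediction target: the contribution of the hidden components.
    target = components[masked].T @ coeffs[masked]
    return visible, target, masked


rng = np.random.default_rng(0)
images = rng.random((20, 48))  # 20 toy "images" with 48 pixels each
mean, components = pca_basis(images)
visible, target, masked = mask_components(images[0], mean, components, 0.5, rng)
```

Every principal component spans the whole image, so hiding some of them removes global information rather than a local region. That is what the two points above are about: with patch masking, neighbouring pixels can trivially fill in (1) or say nothing about (2) the hidden region.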
