nusretozates.bsky.social • 96 days ago

The model's image representation is very strong for segmentation, but visualizing the CLS token's attention does not reveal the objects in the image the way it does for DINOv2 trained on ImageNet. Fun fact: this model's CLS token also attends to the bird if you give it an image of a bird 😂
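
For readers unfamiliar with the kind of visualization mentioned here, a minimal sketch of the usual recipe: take the CLS query row from one attention layer, average it over heads, and reshape the weights on the patch tokens back onto the patch grid. The post does not say which model or code was used, so the function name `cls_attention_map`, its parameters, and the toy attention values are all hypothetical, and extracting the attention weights from a real model is left out because it depends on the library.

```python
def cls_attention_map(attn, grid_h, grid_w, num_prefix_tokens=1):
    """Turn one layer's attention into a CLS-to-patch heatmap (illustrative helper).

    attn: nested lists shaped [heads][tokens][tokens], rows already softmax-normalized.
    num_prefix_tokens: tokens placed before the patches (CLS, plus any register tokens).
    Assumes the CLS token sits at index 0.
    """
    num_heads = len(attn)
    num_patches = grid_h * grid_w
    if len(attn[0][0]) != num_prefix_tokens + num_patches:
        raise ValueError("token count does not match the patch grid")

    # Row 0 is the CLS query; keep only its weights on the patch tokens.
    cls_rows = [head[0][num_prefix_tokens:] for head in attn]

    # Average the CLS-to-patch weights over heads.
    mean_weights = [
        sum(row[i] for row in cls_rows) / num_heads for i in range(num_patches)
    ]

    # Rescale to [0, 1] so the result can be displayed as a heatmap.
    lo, hi = min(mean_weights), max(mean_weights)
    scale = (hi - lo) or 1.0
    norm = [(w - lo) / scale for w in mean_weights]

    # Reshape the flat patch sequence back into grid rows.
    return [norm[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


# Toy input: 2 heads, 1 CLS token plus a 2x2 patch grid (5 tokens in total).
toy_attn = [
    [[0.2, 0.1, 0.5, 0.1, 0.1]] + [[0.2] * 5 for _ in range(4)],
    [[0.2, 0.1, 0.3, 0.3, 0.1]] + [[0.2] * 5 for _ in range(4)],
]
heatmap = cls_attention_map(toy_attn, grid_h=2, grid_w=2)
```

In the toy input both heads put the most CLS attention on the second patch, so that cell ends up brightest. Averaging over heads is only one choice; plotting each head separately is also common for this kind of inspection.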
