ᴍᴀᴊᴏʀ ᴛᴏᴍ ꜰʟᴏᴀᴛɪɴɢ ɪɴ ᴛʜᴇ ʟᴀᴛᴇɴᴛ ꜱᴘᴀᴄᴇ 🧑‍🚀
𝐆𝐥𝐨𝐛𝐚𝐥 𝐚𝐧𝐝 𝐃𝐞𝐧𝐬𝐞 𝐎𝐟𝐟𝐢𝐜𝐢𝐚𝐥 𝐄𝐦𝐛𝐞𝐝𝐝𝐢𝐧𝐠𝐬 𝐨𝐟 𝐌𝐚𝐣𝐨𝐫 𝐓𝐎𝐌 - 𝐎𝐩𝐞𝐧 𝐚𝐧𝐝 𝐅𝐫𝐞𝐞 𝐀𝐜𝐜𝐞𝐬𝐬 𝐨𝐧 @hf.co
Embedding Earth observation data with pre-trained AI models is becoming a hot topic, for all the right reasons.
More info in thread 🧵
Comments
In a collaboration between ESA Φ-lab and CloudFerro, we extended the Major TOM standard to also support derivative embedding datasets:
https://arxiv.org/abs/2412.05600
💻 Codebase (Major TOM subpackage) https://github.com/ESA-PhiLab/Major-TOM/tree/main/src/embedder
🔥 Jupyter Notebook https://github.com/ESA-PhiLab/Major-TOM/blob/main/05-Generate-Major-TOM-Embeddings.ipynb
📑 Arxiv Preprint https://arxiv.org/abs/2412.05600
🌈 SSL4EO-S2L1C https://huggingface.co/datasets/Major-TOM/Core-S2L1C-SSL4EO
📡 SSL4EO-S1RTC https://huggingface.co/datasets/Major-TOM/Core-S1RTC-SSL4EO
📎 SigLIP-S2RGB https://huggingface.co/datasets/Major-TOM/Core-S2RGB-SigLIP
🦕 DINOv2-S2RGB https://huggingface.co/datasets/Major-TOM/Core-S2RGB-DINOv2
Joint work with @mkluczek.bsky.social & @jed0y.bsky.social
𝐑𝐞𝐚𝐝 𝐌𝐨𝐫𝐞:
https://arxiv.org/abs/2412.05600
Agreed, I think it is interesting to look at reconstruction loss, but I feel like some models will really shine too if they irreversibly remove some of the visual information!
https://github.com/ESA-PhiLab/Major-TOM/blob/8b2ff3131076bb49429e66b9725a92f670a2af17/src/embedder/MajorTOM_Embedder.py#L82
Agreed. I was also thinking of the loss as a proxy for whatever the model gave up encoding.
E.g. an unusual change might show up more in the increase in loss than in the change in the embeddings.
True that decoders, and losses, can be unique to an encoder...
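A toy sketch of the reconstruction-loss idea discussed above, in case it helps. Everything here is hypothetical and not part of the Major TOM codebase: `ENCODER` is a made-up linear encoder that keeps two of three input features, and the decoder is just its transpose. It shows a change that leaves the embedding untouched but still raises the reconstruction loss.

```python
import math

# Hypothetical toy encoder (an assumption for illustration only):
# it keeps the first two of three input features and discards the third.
# The rows are orthonormal, so the matching decoder is the transpose.
ENCODER = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]


def encode(x):
    """Project the input onto the encoder rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in ENCODER]


def decode(z):
    """Map an embedding back to input space with the transposed encoder."""
    n_features = len(ENCODER[0])
    return [
        sum(ENCODER[k][i] * z[k] for k in range(len(z)))
        for i in range(n_features)
    ]


def l2(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def reconstruction_loss(x):
    """Distance between the input and its decode(encode(x)) round trip."""
    return l2(x, decode(encode(x)))


before = [1.0, 2.0, 0.0]
after = [1.0, 2.0, 3.0]  # the change happens only in the discarded feature

embedding_shift = l2(encode(before), encode(after))
loss_increase = reconstruction_loss(after) - reconstruction_loss(before)

print("embedding shift:", embedding_shift)
print("loss increase:  ", loss_increase)
```

In this toy case the embeddings do not move at all, while the loss grows by the size of the change. A real encoder is nonlinear, and its decoder may be specific to it, so treat this as intuition only.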