ilyassmoummad.bsky.social
Deep Representation Learning for Ecology. Interested in Computer Vision, Natural Language Processing, and Machine Listening. GScholar: https://scholar.google.com/citations?user=2hA2XZcAAAAJ
36 posts 297 followers 287 following

Long-audio understanding: Audio Flamingo 2 (AF2), using a custom CLAP model, synthetic data, and multi-stage curriculum learning, achieved state-of-the-art performance on over 20 benchmarks, including a new long-audio dataset (LongAudio).

Interesting talk by Yi Ma about the nature of intelligence, what we have done so far in AI, and what to do next: scds1001.dirk.hk/L-2.html

🎉 Celebrating 100,000 Modeled Taxa in the iNaturalist Open Range Map Dataset! To mark this milestone, we're making model-generated distribution data even more accessible. Explore, analyze, and use this data to power biodiversity research! 🌍🔍 www.inaturalist.org/posts/106918

Kernel Audio Distance (KAD), a new audio generation evaluation metric, was proposed, showing faster convergence, lower computational cost, and better alignment with human perception than Fréchet Audio Distance (FAD). It leverages MMD and advanced embeddings. GPU acceleration was used.
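At the heart of KAD is the Maximum Mean Discrepancy (MMD). A minimal sketch of an unbiased MMD² estimator with a Gaussian kernel (function names and the bandwidth are illustrative; this is not the KAD implementation, which uses learned audio embeddings):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of x and y.
    d2 = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    # Unbiased estimate of squared MMD between sample sets x and y.
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    # Drop diagonal self-similarity terms for the unbiased estimator.
    sum_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    sum_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return sum_xx + sum_yy - 2 * kxy.mean()

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(0, 1, (200, 8)), rng.normal(0, 1, (200, 8)))
diff = mmd2_unbiased(rng.normal(0, 1, (200, 8)), rng.normal(3, 1, (200, 8)))
assert same < diff  # shifted distributions score a larger MMD
```

Unlike FAD, this estimator makes no Gaussian assumption about the embedding distribution, which is part of the appeal.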

Are you still using LoRA to fine-tune your LLM? 2024 has seen an explosion of new parameter-efficient fine-tuning (PEFT) techniques, thanks to clever uses of the singular value decomposition (SVD). Let's dive into the alphabet soup: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS 🤯...
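The common thread can be sketched in a few lines. Here is a PiSSA-style initialization (a minimal numpy illustration of the idea; the function name and rank are my own, not from any of the papers): factor the weight with an SVD, train only the principal low-rank part, and freeze the residual.

```python
import numpy as np

def pissa_init(w, r):
    # PiSSA-style split: a principal low-rank part (trainable) plus a
    # residual (frozen), built from the top-r singular triplets of W.
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :r] * np.sqrt(s[:r])          # trainable factor A (d_out x r)
    b = np.sqrt(s[:r])[:, None] * vt[:r]   # trainable factor B (r x d_in)
    residual = w - a @ b                   # frozen residual weight
    return a, b, residual

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 12))
a, b, res = pissa_init(w, r=4)
# The split reconstructs the original weight exactly at initialization.
assert np.allclose(res + a @ b, w)
```

Contrast with vanilla LoRA, which initializes the adapter at zero; here the adapter starts on the most informative directions of W.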

Launching: BioDCASE - the Bioacoustics Data Challenge! https://biodcase.github.io/ #DCASE #DCASE2025 #DCLDE #bioacoustics #ai4good

Postdoc job at Naturalis: "Postdoctoral Fellow in Machine Learning & Butterfly Ecology" https://www.naturalis.nl/en/about-us/job-opportunities/postdoctoral-fellow-in-machine-learning-butterfly-ecology (Beautiful museum, great work environment, plus The Netherlands :) #academicjobs #postdoc

Yi Ma & colleagues managed to simplify DINO & DINOv2 by removing many ingredients and adding a robust regularization term from information theory (coding rate) that learns informative, decorrelated features. Happy to see principled approaches advance deep representation learning!

Want strong SSL, but not the complexity of DINOv2? CAPI: Cluster and Predict Latent Patches for Improved Masked Image Modeling.

🚀 "UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate" is accepted at #ICLR2025 A thread! 📜 Paper: arxiv.org/abs/2409.01985 🖥️ Code: github.com/tachella/uns...

"Compositional Entailment Learning for Hyperbolic Vision-Language Models" #ICLR25 With: Avik Pal, Max van Spengler, Guido Maria D'Amely di Melendugno, Alessandro Flaborea, and @pascalmettes.bsky.social Paper: arxiv.org/abs/2410.06912

Short introduction to optimal transport with a simple 2D discrete example. The video was done in Manim a few years ago and I have sadly lost the original code. youtu.be/Os1xkUlwjjo
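Since the original code is lost, here is a minimal sketch (not the video's code) of discrete OT between two small 2D point clouds with uniform weights, where the transport problem reduces to an optimal assignment:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Two small 2D point clouds with equal, uniform weights.
rng = np.random.default_rng(0)
x = rng.normal(0, 1, (5, 2))
y = rng.normal(2, 1, (5, 2))

# Cost matrix: squared Euclidean distance between every pair of points.
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)

# With uniform weights and equal sizes, OT is an assignment problem.
rows, cols = linear_sum_assignment(cost)
w2_squared = cost[rows, cols].mean()  # estimate of squared 2-Wasserstein distance
```

For unequal weights or sizes one needs a full transport plan (e.g. a linear program) rather than an assignment, but the 2D discrete intuition is the same.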

📢 The short description of the tasks is now available on the website 👇 dcase.community/challenge2025/

Great to see how all the CV pioneers thought about various CV problems back then, and how 20 years of research have changed the view on most of these problems. There is still much left to do. It would be great to repeat this series and look back on it 20 years from now.

The Nadaraya-Watson estimator is a linear local averaging estimator relying on a pointwise nonnegative kernel. Most of the time, a box or Gaussian kernel is used. https://www.jstor.org/stable/25049340?seq=2
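A minimal sketch with a Gaussian kernel (the bandwidth h and toy data are illustrative):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, h=0.5):
    # Local weighted average with a Gaussian kernel of bandwidth h:
    # f(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h)
    w = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / h) ** 2)
    return (w * y_train).sum(1) / w.sum(1)

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.normal(0, 0.2, 100)
y_hat = nadaraya_watson(x, y, x, h=0.3)
```

The estimate is linear in the responses y_i, which is what makes its bias/variance analysis tractable.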

When I was a kid I was fascinated by SETI, the Search for Extraterrestrial Intelligence. Now we live in an era when it is becoming meaningful to search for "extraterrestrial life" not just in our universe but in simulated universes as well. This project provides new tools toward that dream:

Schrödinger Bridge Flow for Unpaired Data Translation (by @vdebortoli.bsky.social et al.) It will take me some time to digest this article fully, but it's important to follow the authors' advice and read the appendices, as the examples are helpful and well-illustrated. 📄 arxiv.org/abs/2409.09347

My book is (at last) out, just in time for Christmas! A blog post to celebrate and present it: francisbach.com/my-book-is-o...

General structure of a paper:
- general ideas
- general case
- general case
- general case
- what we actually do

How it should be:
- what we actually do
- why we think it's great as one method of a general class
- how we got there
- how we got there
- how we got there

Hyperbolic learning is growing rapidly by the day. From weekly alerts in 2023 to daily digests in 2024! From our current research, it is clear that 2025 will be a huge year for hyperbolic learning research. I gave an interview elaborating on our research: shorturl.at/CQD53

Brilliant talk by Ilya, but he's wrong on one point. We are NOT running out of data. We are running out of human-written text. We have more videos than we know what to do with. We just haven't solved pre-training in vision. Just go out and sense the world. Data is easy.

🚀 Introducing the Byte Latent Transformer (BLT) – an LLM architecture that scales better than Llama 3 using patches instead of tokens 🤯 Paper 📄 dl.fbaipublicfiles.com/blt/BLT__Pat... Code 🛠️ github.com/facebookrese...

We @imagineenpc.bsky.social are slowly but surely entering our proposals for master's degree internships here: docs.google.com/document/d/1... These are 6-month projects that typically correspond to the end-of-study project in the French curriculum. Probably more offers to come, check it regularly.

I'm pleased to share that our recent paper with @2ptmvd has been accepted to the Philosophical Transactions of the Royal Society. Here's the ‘Accepted Author Version’: drive.google.com/file/d/1jdtr... And here it is on arxiv without the fancy formatting: arxiv.org/abs/2409.06219 1/3

I started to put together a starter pack for research in AI+Ecology, check it out and let me know if you would like to be added! go.bsky.app/8zugFF6

How do language models organize concepts and their properties? Do they use taxonomies to infer new properties, or infer based on concept similarities? Apparently, both! 🌟 New paper with my fantastic collaborators @amuuueller.bsky.social and @kanishka.bsky.social

🎯 How can we empower scientific discovery in millions of nature photos? Introducing INQUIRE: A benchmark testing if AI vision-language models can help scientists find biodiversity patterns- from disease symptoms to rare behaviors- hidden in vast image collections. Thread👇🧵

The origins of "attention", which @karpathy.bsky.social correctly calls a "brilliant (data-dependent) weighted average operation", were not in machine learning - in fact this idea dates back to data-dependent "filters" in image processing from the 90s. 1/n
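The "data-dependent weighted average" view fits in a few lines (a minimal single-head sketch with illustrative shapes, not any particular library's implementation):

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Data-dependent weighted average: the weights are computed from
    # query-key similarities, then the values are averaged with them.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v                  # convex combination of values

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
out = attention(q, k, v)
```

The 90s image-processing "filters" mentioned above (e.g. bilateral-style filtering) share exactly this structure: weights computed from the data itself, then a normalized average.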

Hellinger and Wasserstein are the two main geodesic distances on probability distributions. While both minimize the same energy, they differ in their interpolation methods: Hellinger focuses on density, whereas Wasserstein emphasizes position displacements.
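A 1D discrete sketch of the difference, using two Gaussians (illustrative choices throughout): the Hellinger geodesic blends square-root densities in place, while the 1D Wasserstein geodesic blends quantile functions, displacing mass.

```python
import numpy as np

x = np.linspace(-6, 6, 400)
dx = x[1] - x[0]
p = np.exp(-0.5 * (x + 3) ** 2); p /= p.sum() * dx  # N(-3, 1)
q = np.exp(-0.5 * (x - 3) ** 2); q /= q.sum() * dx  # N(+3, 1)
t = 0.5

# Hellinger interpolation: blend square-root densities, renormalize.
h = ((1 - t) * np.sqrt(p) + t * np.sqrt(q)) ** 2
h /= h.sum() * dx  # stays bimodal: mass fades in place

# Wasserstein interpolation: blend quantile functions (1D displacement).
levels = np.linspace(0.01, 0.99, 200)
cp = np.cumsum(p) * dx
cq = np.cumsum(q) * dx
qp = np.interp(levels, cp, x)  # quantiles of p
qq = np.interp(levels, cq, x)  # quantiles of q
w_samples = (1 - t) * qp + t * qq  # unimodal: mass moves to the middle
```

At t = 0.5 the Hellinger midpoint keeps two bumps at ±3 with fading mass, while the Wasserstein midpoint is a single bump at 0, which is exactly the density-vs-displacement contrast.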

📢 Exciting news! My PhD defense titled "Invariant Representation Learning for Few-Shot Bioacoustic Event Detection and Classification" is happening this Monday, December 2nd, at 9 AM (CET). It'll be livestreamed on YT! 🎥 If you're interested, drop me a message for the link.

🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist? Turns out you can, and here is how: arxiv.org/abs/2411.15099 Really excited to share this work on multimodal pretraining as my first bluesky entry! 🧵 A short and hopefully informative thread:

A really cool paper from Kyutai demonstrates how model capabilities can be extended to a new domain (e.g., learning a new language) while preserving the original capabilities. This is achieved by leveraging the concept of adapters.

NeurIPS Test of Time Awards: Generative Adversarial Nets Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio Sequence to Sequence Learning with Neural Networks Ilya Sutskever, Oriol Vinyals, Quoc V. Le

So awesome to see the evolution of SFX generation from the Adobe titans!

Shannon's entropy measures the uncertainty or information content in a probability distribution. It's a concept in data compression and communication introduced in the paper “A Mathematical Theory of Communication”. https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf
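The definition fits in a few lines (a minimal sketch; the function name and the base-2 convention for bits are my own choices):

```python
import numpy as np

def entropy(p, base=2):
    # H(p) = -sum_i p_i log p_i, skipping zero-probability outcomes
    # (by convention 0 * log 0 = 0).
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p)) / np.log(base)

assert abs(entropy([0.5, 0.5]) - 1.0) < 1e-12  # fair coin: 1 bit
assert entropy([1.0]) == 0.0                   # certain outcome: 0 bits
```

Entropy is maximized by the uniform distribution, which is why it bounds how much a lossless code can compress on average.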