andrewgwils.bsky.social
Machine Learning Professor https://cims.nyu.edu/~andrewgw
51 posts 2,261 followers 167 following

These DeepSeek results mostly just reflect the diminishing gap between open and closed models, such that any company with billions can start with Llama as a baseline, make some tweaks, and appear like the next OpenAI. Going forward, data and scale won't be the decisive advantage.

It's not the size of your parameter space that matters, it's how you use it.

With interview season coming, don't despair. I conspicuously forgot the name of the place I was interviewing in a 1-1. I made sure to name drop the university a bunch in my job talk right after, just so my allies could be like "he really does know the name".

There's apparently another Andrew Wilson at NYU who teaches piano lessons. I get a lot of emails meant for him. Maybe I'll charge his rate minus $1.

📢 My team at Meta (including Yaron Lipman and Ricky Chen) is hiring a postdoctoral researcher to help us build the next generation of flow, transport, and diffusion models! Please apply here and message me: www.metacareers.com/jobs/1459691...

We're excited to announce the ICML 2025 call for workshops! The CFP and submission advice can be found at: icml.cc/Conferences/.... The deadline is Feb 10. Submit some creative proposals!

Happy New Year everyone! Excited for the year ahead.

Many of the greatest papers, now canonical works, have a story of resistance, tension, and, finally, a crucial advocate. It's shockingly common. Why is there a bias against excellence? And what happens to those papers, those people, when no one has the courage to advocate?

Research scientists using industry GPUs these days... "But Mr Garnier… we're scientists, we want to change the world. You have the finest GPUs that money can buy! You employ 3000 research staff." www.youtube.com/watch?v=hdHF...

This is your monthly reminder that understanding deep learning does not require rethinking generalization, and it never did.

So excited about this new work on Bayesian optimization for antibody design! It works by teaching a generative model how the human immune system evolves antibodies for strong and stable binders. Satisfying mix of ML+Bio. Check out the great thread from @alannawzadamin.bsky.social and the paper!

Excited for the #NeurIPS2024 workshops today! I'll be speaking at: (1) Science of DL (panel, 3:10-4:10, scienceofdlworkshop.github.io/schedule/) (2) "Time Series in the Age of Large Models" (talk, 4:39-5:14, neurips-time-series-workshop.github.io).

New model trained on a new dataset of nearly a million evolving antibody families at the AIDrugX workshop Sunday at 4:20 pm (#76) #NeurIPS! Collab between @andrewgwils.bsky.social and BigHatBio. Stay tuned for a full thread on how we used the model to optimize antibodies in the lab in coming days!

It feels like _so_ much time has passed since NeurIPS in New Orleans last year. We're in a different universe.

Nice crowd and lots of engagement at our NeurIPS poster today, with Sanae Lotfi presenting on token-level generalization bounds for LLMs! arxiv.org/abs/2407.18158

Every year at NeurIPS, I get a sense of where the community is headed. I'm so happy that the era of larger language models on larger datasets is coming to an end.

Is wearing a scarf indoors a power move?

The logo needs more affirmation!

I wanted to make my first post about a project close to my heart. Linear algebra is an underappreciated foundation for machine learning. Our new framework CoLA (Compositional Linear Algebra) exploits algebraic structure arising from modelling assumptions for significant computational savings! 1/4
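To give a flavor of the kind of savings structure can buy (this is an illustrative NumPy sketch of one structure CoLA-style frameworks can exploit, not CoLA's actual API): if a matrix is a Kronecker product A = B ⊗ C, a linear solve never needs to form the large nm × nm matrix. Using the identity (B ⊗ C) vec(X) = vec(C X Bᵀ), we only solve against the small factors.

```python
import numpy as np

# Illustrative example (not CoLA's API): solving A x = b when
# A = kron(B, C) has Kronecker structure.
# A dense solve costs O((n*m)^3); the structured solve only
# factors B (n x n) and C (m x m).

rng = np.random.default_rng(0)
n, m = 20, 30
# Diagonally shifted random factors, so both are well-conditioned.
B = rng.standard_normal((n, n)) + n * np.eye(n)
C = rng.standard_normal((m, m)) + m * np.eye(m)
b = rng.standard_normal(n * m)

# Structured solve via (B ⊗ C)^{-1} vec(X) identity:
# un-vec b (column-major convention) into an m x n matrix,
# then apply C^{-1} on the left and B^{-T} on the right.
M = b.reshape(n, m).T
x_fast = np.linalg.solve(B, np.linalg.solve(C, M).T).reshape(-1)

# Dense solve for comparison: forms the full 600 x 600 matrix.
A = np.kron(B, C)
x_dense = np.linalg.solve(A, b)

assert np.allclose(x_fast, x_dense)
```

The same composition idea extends to sums, block structure, low-rank terms, and so on, which is where a compositional framework pays off over hand-writing each special case.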