koustuvsinha.com
🔬Research Scientist, Meta AI (FAIR). 🎓PhD from McGill University + Mila 🙇‍♂️I study Multimodal LLMs, Vision-Language Alignment, LLM Interpretability & I’m passionate about ML Reproducibility (@reproml.org) 🌎https://koustuvsinha.com/
17 posts 293 followers 434 following

The HuggingFace/Nanotron team just shipped an entire pretraining textbook in interactive format. huggingface.co/spaces/nanot... It’s not just a great pedagogical resource; it also presents a wealth of data and experiments, many of them for the first time, in a systematic way.

Excited to have two papers at #NAACL2025! The first reveals how human over-reliance can be exacerbated by LLM friendliness. The second presents a novel computational method for concept tracing. Check them out! arxiv.org/pdf/2407.07950 arxiv.org/pdf/2502.05704

👋 Hello world! We’re thrilled to announce the v0.4 release of fairseq2 — an open-source library from FAIR powering many projects at Meta. pip install fairseq2 and explore our trainer API, instruction & preference finetuning (up to 70B), and native vLLM integration.
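A minimal sketch of installing the release above and checking which version you got (my own example, not from the fairseq2 docs; the importlib.metadata lookup assumes the PyPI distribution is named "fairseq2", which is not stated in the announcement):

```python
# Minimal sketch: install fairseq2 as in the announcement, then confirm the
# package imports and report the installed version.
# Assumption: the PyPI distribution name and the import name are both "fairseq2".
import importlib.metadata
import subprocess
import sys

# Same install command as in the post, run through the current interpreter.
subprocess.run([sys.executable, "-m", "pip", "install", "fairseq2"], check=True)

# Verify the import works and print the installed version.
import fairseq2  # noqa: E402

print("fairseq2 version:", importlib.metadata.version("fairseq2"))
```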

I am shocked by the death of Felix Hill. He was one of the brightest minds of my generation. His last blog post on the stress of working in AI is very poignant: apart from the emptiness of working mostly to make billionaires even richer, there's the intellectual emptiness of 'scale is all you need'.

We posted our paper on arXiv recently, sharing this here too: arxiv.org/abs/2412.141... - work led by our amazing intern Peter Tong. Key findings:
- LLMs can be trained to generate visual embeddings!!
- VQA data appears to help a lot in generation!
- Better understanding = better generation!

🚨 We are pleased to announce the first in-person event for the Machine Learning Reproducibility Challenge, MLRC 2025! Save the date: August 21st, 2025, at Princeton!

Our PRISM alignment paper won a best paper award at #neurips2024! All credit to @hannahrosekirk.bsky.social, A. Whitefield, P. Röttger, A. M. Bean, K. Margatina, R. Mosquera-Gomez, J. Ciro, @maxbartolo.bsky.social, H. He, B. Vidgen, S. Hale. Catch Hannah tomorrow at neurips.cc/virtual/2024/poster/97804

Check out the MLRC 2023 posters at #NeurIPS 2024 this week: reproml.org/proceedings/ - do drop by and say hi!

The return of the Autoregressive Image Model: AIMv2 is now going multimodal. Excellent work by @alaaelnouby.bsky.social and team, with code and checkpoints already up: arxiv.org/abs/2411.14402

For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips over many years. It's long, but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

When I first read this paper, I instinctively scoffed at the idea. But the more I look at the empirical results, the more I'm convinced this paper highlights something fundamentally amazing. Lots of exciting research in this direction will come very soon! arxiv.org/abs/2405.07987

All the ACL chapters are here now: @aaclmeeting.bsky.social @emnlpmeeting.bsky.social @eaclmeeting.bsky.social @naaclmeeting.bsky.social #NLProc

Doing good science is 90% finding a science buddy to constantly talk to about the project.

Asking lots of "dumb" questions isn't a sign of stupidity. If anything, it's more likely the sign of a person who is very strict about always keeping a crystal-clear mental model of the topic at hand.

New here? Interested in AI/ML? Check out these great starter packs! AI: go.bsky.app/SipA7it RL: go.bsky.app/3WPHcHg Women in AI: go.bsky.app/LaGDpqg NLP: go.bsky.app/SngwGeS AI and news: go.bsky.app/5sFqVNS You can also search all starter packs here: blueskydirectory.com/starter-pack...