esteng.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

🚨UPCORE is our new method for balancing unlearning/forgetting with maintaining model performance. Best part is it works by selecting a coreset from the data rather than changing the model, so it is compatible with any unlearning method, with consistent gains for 3 methods + 2 tasks!

submitted 7 days ago • 0 comments

🚨 Introducing UPCORE, to balance deleting info from LLMs with keeping their other capabilities intact. UPCORE selects a coreset of forget data, leading to a better trade-off across 2 datasets and 3 unlearning methods. 🧵👇

submitted 7 days ago • 1 comment

🚨 Check out "UTGen & UTDebug" for learning to automatically generate unit tests (i.e., discovering inputs which break your code) and then applying them to debug code with LLMs, with strong gains (>12% pass@1) across multiple models/datasets! (see details in 🧵👇) 1/4

submitted 27 days ago • 1 comment

🚨 Excited to announce UTGen and UTDebug, where we first learn to generate unit tests and then apply them to debugging generated code with LLMs, with strong gains (+12% pass@1) on LLM-based debugging across multiple models/datasets via inf.-time scaling and cross-validation+backtracking! 🧵👇

submitted 28 days ago • 0 comments

🚨 Excited to share: "Learning to Generate Unit Tests for Automated Debugging" 🚨 which introduces ✨UTGen and UTDebug✨ for teaching LLMs to generate unit tests (UTs) and debugging code from generated tests. UTGen+UTDebug yields large gains in debugging (+12% pass@1) & addresses 3 key questions: 🧵👇

submitted 28 days ago • 1 comment

🎉 Congrats to the awesome students, postdocs, & collaborators for this exciting batch of #ICLR2025 and #NAACL2025 accepted papers (FYI some are on the academic/industry job market and a great catch 🙂), on diverse, important topics such as: -- adaptive data generation environments/policies ... 🧵

submitted 36 days ago • 1 comment

🎉Very excited that our work on Persuasion-Balanced Training has been accepted to #NAACL2025! We introduce a multi-agent tree-based method for teaching models to balance: 1️⃣ Accepting persuasion when it helps 2️⃣ Resisting persuasion when it hurts (e.g. misinformation) arxiv.org/abs/2410.14596 🧵 1/4

submitted 40 days ago • 1 comment

🎉Congratulations to Prof. @mohitbansal.bsky.social on being named a 2025 @RealAAAI Fellow for "significant contributions to multimodal AI foundations & faithful language generation and summarization." 👏 16 Fellows chosen worldwide by cmte. of 9 past fellows & ex-president: aaai.org/about-aaai/a...

submitted 42 days ago • 0 comments

Congrats @mohitbansal.bsky.social for being selected to be part of this prestigious #AAAI Fellows group! Very well-deserved recognition of long-term contributions 🎉🎉

submitted 42 days ago • 0 comments

Congrats on this huge (and well-deserved) accomplishment @mohitbansal.bsky.social!

submitted 47 days ago • 1 comment

Deeply honored & humbled to have received the Presidential #PECASE Award by the @WhiteHouse and @POTUS office! 🙏 Most importantly, very grateful to my amazing mentors, students, postdocs, collaborators, and friends+family for making this possible, and for making the journey worthwhile + beautiful 💙

submitted 48 days ago • 5 comments

Apply soon to:
- work with tons of extremely talented and strong students and get hands-on experience mentoring
- benefit from expert mentorship
- be in a great department and live in an area with high quality of life
and more! Feel free to ping me if you want to hear more from a current postdoc!

submitted 68 days ago • 0 comments

🚨 We have postdoc openings at UNC 🙂 Exciting+diverse NLP/CV/ML topics**, freedom to create research agenda, competitive funding, very strong students, mentorship for grant writing, collabs w/ many faculty+universities+companies, superb quality of life/weather. Please apply + help spread the word 🙏

submitted 71 days ago • 1 comment

✈️ I've landed in Vancouver for #NeurIPS2024
11/12: LACIE, a pragmatic speaker-listener method for training LLMs to express calibrated confidence: arxiv.org/abs/2405.21028
12/12: GTBench, a benchmark for game-theoretic abilities in LLMs: arxiv.org/abs/2402.12348
P.S. I'm on the faculty market👇

submitted 84 days ago • 1 comment

Working with Elias has been an absolute pleasure! His passion for research and dedication to mentoring are inspiring. Can’t wait to see all the amazing work his lab will do as he becomes a professor! ✨

submitted 88 days ago • 1 comment

Look out for @jmincho.bsky.social's application! He’s absolutely one of the top multimodal minds, I’ve been a fan of his work even before joining UNC and have been lucky enough to see him in action as a top-notch researcher and mentor!

submitted 86 days ago • 1 comment

🚨 I am on the faculty job market this year 🚨 I will be presenting at #NeurIPS2024 and am happy to chat in-person or digitally! I work on developing AI agents that can collaborate and communicate robustly with us and each other. More at: esteng.github.io and in thread below 🧵👇

submitted 89 days ago • 2 comments

Looking forward to giving this Distinguished Lecture at Stony Brook next week & meeting the several awesome NLP + CV folks there - thanks Niranjan + all for the kind invitation 🙂 PS. Excited to give a new talk on "Planning Agents for Collaborative Reasoning and Multimodal Generation" ➡️➡️ 🧵👇

submitted 91 days ago • 1 comment

🚨Just read an exciting new paper from Justin 👇 on how to get the benefits of reasoning backwards (i.e. starting with the solution and tracing it back to the question) into a model while keeping the test-time cost of reasoning the same! 🧵1/3

submitted 92 days ago • 2 comments

Congratulations to #UNC CS student David Wan for winning the prestigious 2024 Google PhD Fellowship in NLP. 🎉🥳 A very well-deserved honor for his impactful work on factual and faithful text+multimodal generation with Prof. @mohitbansal.bsky.social and UNC NLP group! ▶️ blog.google/technology/r...

submitted 103 days ago • 0 comments

APPLY: TENURE-TRACK/TEACHING FACULTY. Research areas include but are not limited to: ML, NLP, vision, graphics, systems, bioinformatics, security, medical imaging, robotics, RT systems, AR/VR. Join our team—committed to research, teaching, and collaboration. ➡️ https://cs.unc.edu/faculty-hiring/

submitted 103 days ago • 0 comments

And it's a wrap at #EMNLP2024! Thanks everyone for joining us & making it memorable+engaging -- hope y'all enjoyed the program 🤗 💙 PS. A big thanks again to my co-chairs Thamar Solorio, Yun-Nung Chen, Yaser Al-Onaizan + the awesome committee members for all their effort! -- signing off, your PCs

submitted 106 days ago • 1 comment

New here? Interested in AI/ML? Check out these great starter packs!
AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS
You can also search all starter packs here: blueskydirectory.com/starter-pack...

submitted 115 days ago • 70 comments

The third UnImplicit workshop on Understanding Implicit and Underspecified Language will be collocated with EACL 2024 in Malta! Co-organized with Alisa Liu, @esteng.bsky.social, Daniel Fried and Sandro Pezzelle. Submission deadline: 18th of Dec. Website: unimplicit2024.github.io

submitted 484 days ago • 0 comments

I'm super excited for this workshop, and think that it's more relevant than ever. Models are now performing well at a lot of language tasks (sometimes even better than people). Does this mean language is solved? No -- it usually means our tasks are too easy. 1/4

submitted 484 days ago • 1 comment