shikharmurty.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

shikharmurty.bsky.social

Final-year PhD student in Computer Science @Stanford. Work on:
- Compositionality, syntax (language structure)
- Web agents: synthetic data, tree search, exploration (language interpretation)

24 posts 443 followers 124 following

Ever dreamed of AI agents learning through unsupervised interaction with the open world? Our latest preprint introduces NNetNav-Live, which collects training data through exploration on real websites and hindsight labeling, producing a SOTA OSS agent.

submitted 22 days ago • 1 comment

Want to make a browser agent for any domain like banking or healthcare? We propose methods for training LLMs with open-ended, unsupervised interaction on live websites: ✅ OSS SoTA on WebVoyager ✅ world's smallest high-performing web-agent Try it here: nnetnav.dev

submitted 22 days ago • 1 comment

going to stay off twitter for my own mental health. something has gone horribly wrong with that platform.

submitted 62 days ago • 0 comments

Couldn't make it to NeurIPS due to work, but do check out our workshop happening in West Ballroom B. Lots of cool things to come, including a very fun panel!

submitted 75 days ago • 0 comments

Come visit our poster "MoEUT: Mixture-of-Experts Universal Transformers" on Friday at 4:30 in East Exhibit Hall A-C #1907 on #NeurIPS2024. With Kazuki Irie, Jürgen Schmidhuber, Christopher Potts and @chrmanning.bsky.social.

submitted 78 days ago • 1 comment

The extraordinary recent takeover of ML/AI by #NLP is well-known but insufficiently reflected on. Look at the @neuripsconf.bsky.social tutorials in 2024! neurips.cc/virtual/2024... 14 tutorials; 6 have "LLM" in the title; 4 more cover foundation models, with large NLP coverage. That's > 70% 😲

submitted 81 days ago • 1 comment

🚨 Thrilled to share that Compositional Generalization Across Distributional Shifts with Sparse Tree Operations received a spotlight award at #NeurIPS2024! 🌟 I'll present a poster on Tuesday and give an invited lightning talk at the System 2 Reasoning Workshop on Sunday. 🧵👇

submitted 81 days ago • 1 comment

🧵-1 We are thrilled to release #AgentLab, a new open-source package for developing and evaluating web agents. This builds on the new #BrowserGym package which supports 10 different benchmarks, including #WebArena.

submitted 87 days ago • 2 comments

Folks, I'm not going to be at NeurIPS this year, but we have an awesome workshop that I'm super proud of. Go attend, and use the link below to ask all of your burning questions about LLM reasoning, agents and compositionality!

submitted 87 days ago • 0 comments

🎊 Excited for #neurips2024 and our "System 2 Reasoning at Scale" workshop. We have an exciting lineup of speakers who will answer your most burning questions about AI and reasoning 🚀 🔥 Got spicy questions? Submit & vote here: app.sli.do/event/dJNU63...

submitted 87 days ago • 1 comment

I also wear the AI agents researcher hat. Can't say I'm similarly impressed by reviewers in that community...

submitted 92 days ago • 0 comments

ACL syntax track reviewers >> almost any other conference. These folks care about their sub-field and I learn something new every time!

submitted 93 days ago • 1 comment

What is a probing task that is purely about semantics? Context: I have a probe trained to predict dependency relations, and would like to train another one on a semantics-only task (for research purposes).

submitted 96 days ago • 3 comments

Asked GPT-4o to draw parse trees in two languages:

submitted 99 days ago • 1 comment

Hot take (since it's still just friends on this platform): It's crazy how the classic "sample and rerank" baseline from machine translation and IR got re-branded as "scaling up inference-time compute".

submitted 99 days ago • 1 comment
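
For readers unfamiliar with the "sample and rerank" baseline mentioned in the post above, here is a minimal sketch of the general idea: draw several candidates, score each one, and keep the highest-scoring. This is only an illustration, not anything from the post. The names `sample_and_rerank`, `toy_sample`, and `toy_score` are hypothetical, and the toy sampler and scorer stand in for a real generator and reranker.

```python
import random
from typing import Callable, List, Tuple


def sample_and_rerank(
    prompt: str,
    sample: Callable[[str, random.Random], str],
    score: Callable[[str, str], float],
    n: int = 8,
    seed: int = 0,
) -> Tuple[str, List[Tuple[float, str]]]:
    """Draw n candidates, score each, and return the best plus the ranked list."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    # Step 1: sample n candidates independently from the generator.
    candidates = [sample(prompt, rng) for _ in range(n)]
    # Step 2: score every candidate and sort from best to worst.
    ranked = sorted(
        ((score(prompt, c), c) for c in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Step 3: keep the top-scoring candidate.
    return ranked[0][1], ranked


def toy_sample(prompt: str, rng: random.Random) -> str:
    """Hypothetical generator: returns a random shuffle of the prompt's words."""
    words = prompt.split()
    rng.shuffle(words)
    return " ".join(words)


def toy_score(prompt: str, candidate: str) -> float:
    """Hypothetical reranker: fraction of positions that match the prompt's word order."""
    ref = prompt.split()
    if not ref:
        return 0.0
    hyp = candidate.split()
    return sum(a == b for a, b in zip(ref, hyp)) / len(ref)


best, ranked = sample_and_rerank(
    "the cat sat on the mat", toy_sample, toy_score, n=16
)
```

Raising `n` spends more compute at inference time in exchange for a better chance that some candidate scores well, which is the connection the post is pointing at.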