stonet2000.bsky.social
PhDing @UCSanDiego @HaoSuLabUCSD @hillbot_ai on scalable robot learning, reinforcement learning, and embodied AI. Co-founded @LuxAIChallenge to build AI competitions. @NSF GRFP fellow http://stoneztao.com
78 posts 2,686 followers 196 following

diffusion policy learning to draw triangles on a canvas from vision, code for this was just released! Not too hard in fact, and I think drawing environments offer some nice opportunities for analysis because there are many ways to draw a triangle. Same with cleaning envs (uploading those soon-ish)

Excited to share that I’ll be joining UC San Diego for my PhD, advised by Professor Hao Su! Many thanks to everyone who helped me along my research journey so far — I’m looking forward to continuing research in robot learning, manipulation, and simulation!

direct sim2real of an RGB dextrous manipulation policy trained via RL + distillation! Just from the videos you can see that a big advantage of RL over pure imitation learning is learning fast behaviors and solving precise tasks more easily. Parallel rendering is doing a lot of wonders for people's research!

2 papers led by two UCSD undergrads I advised/co-advised were accepted to ICLR! Incredibly proud of both of them and will share/repost threads on them later

holy moly!

I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek. arxiv.org/abs/2405.06161

all the robotics related companies highlighted by Jensen at #CES2025. Notably, a very high proportion, 4/6, of the robot brain/foundation model focused companies are started by current professors (including my own advisor!): Covariant, Hillbot, Physical Intelligence, and Skild AI

I probably don’t need to tell you that 2024 was a huge year for robotics. As a long-time robotics researcher, it’s been amazing to watch; some of the things that I always dreamed about actually seem to be happening. For me, there are three big stories: itcanthink.substack.com/p/2024-robot...

given recent news I recall people saying “flying is safer than driving statistically”, yet many think flying is more dangerous. But I wonder if this is because humans model “safety” as the probability of survival *given* that an incident occurs. Under this model, fear of e.g. guns/flight appears more rational now

Silvio came up in a conversation and he’s labeled as “Fei-Fei Li’s husband”. Interesting label for a CS professor at Stanford

Weirdly proud of having made adversaries in 2024, feels like the right outcome in a world where not everyone is doing good faith science

This makes me think that the user-LLM feedback loop might be the ultimate inference time scaling we need. Fast loop = more prompting and refinement with the user aware of each step = better responses (assume user can verify it). Maybe Gemini flash as a consumer product still wins

Stone has been doing some of the coolest sim work for a while now, so this thorough simulator comparison is really helpful. It also matches up: there's usually no free lunch in simulation, so if you claim a speedup you need to drop something somewhere, and it looks like what they dropped was physics

Yesterday the hyped Genesis simulator was released. But it's up to 10x slower than existing GPU sims, not 10-80x faster or 430,000x faster than realtime, since they benchmark mostly static environments. Blog post with corrected open source benchmarks & details: stoneztao.substack.com/p/the-new-hy...

Anything that uses ManiSkill >>>> Everything else

🎉 Our new work tackling long horizon low-level manipulation in apartments is out! ~500GB of demonstration data in sim (you can generate more) and RL/IL baselines all provided. ManiSkill helped make this project scalable and faster to run via GPU sim compared to alternatives

Most don’t know this but I’ve been fencing for about 14 years now. While I no longer compete, it’s still great to be involved by helping referee in local tournaments 🤺 Always good to take a break from research

completely unacceptable. Worst part is after the speaker at NeurIPS is asked about this slide, she continues to reinforce the bias instead of recognizing her mistake