vmoens.bsky.social
Research Engineer - PyTorch core - Meta@London - Open-source/open science advocate - Maintainer of torchrl / tensordict / leanrl - Former MD - Neuroscience PhD https://github.com/vmoens
85 posts 1,917 followers 638 following
comment in response to post
MLGym makes it super easy to set up complex tasks to be solved by LLMs. Honestly one of the most intuitive APIs I have ever seen in that space!
comment in response to post
After that, your LLM reads these instructions and outputs commands along with some reasoning. The commands are executed in the Docker container's shell, and the result is returned to the agent.
comment in response to post
Good old cProfile with snakeviz is pretty cool too jiffyclub.github.io/snakeviz/ Again, not suited to CUDA ops, and not as fine-grained as line_profiler, but quite useful for macro-level tracking of compute time
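A minimal sketch of the workflow (function names here are arbitrary): profile in-process with cProfile, print a summary with pstats, and dump stats to a file that snakeviz can visualize.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # toy stand-in for the code you actually want to profile
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# print the top 5 entries by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())

# for snakeviz, dump to a file and run `snakeviz out.prof` from a terminal:
# profiler.dump_stats("out.prof")
```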
comment in response to post
torch.utils.benchmark.Timer is amazing for assessing the runtime of a whole isolated piece of code, but be mindful that the way it plays with global variables isn't always obvious and may differ from time.time() on occasion
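A small sketch of the point about globals: the `stmt` runs inside the namespace you pass via `globals`, not your script's namespace, so every name it uses must be handed over explicitly (the matmul here is just a placeholder workload).

```python
import torch
from torch.utils.benchmark import Timer

x = torch.randn(128, 128)

timer = Timer(
    stmt="x @ x",        # code under test, evaluated in `globals`
    globals={"x": x},    # pass every name the stmt needs explicitly
)
measurement = timer.timeit(100)  # run the stmt 100 times
print(measurement.median)        # median seconds per run
```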
comment in response to post
I use line_profiler to check the code line by line (careful: CUDA ops are async, do not trust it for these!) - very useful to check CPU overhead pypi.org/project/line...
comment in response to post
The profilers I use: PyTorch profiler to view the time spent in the various ops of my code. It can reliably show you what's going on for a single iteration of your function. pytorch.org/tutorials/re...
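A minimal example of collecting a CPU trace with torch.profiler (the workload and sizes here are arbitrary; add ProfilerActivity.CUDA when profiling GPU code):

```python
import torch
from torch.profiler import ProfilerActivity, profile

x = torch.randn(256, 256)

with profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(3):
        y = x @ x  # the op(s) you want to inspect

# per-op summary, sorted by total CPU time
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```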
comment in response to post
In general, in-place operations are not preferable to regular ones (you won't gain much memory or speed). Don't load your code with ReLU(inplace=True), mul_, or add_ unless absolutely necessary.
comment in response to post
Using hydra or similar fancy config objects: avoid calling cfg.attribute often in your code. Instead, cache the argument values in your script as global workspace variables.
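A toy illustration of the idea (DummyCfg below is a hypothetical stand-in for a hydra config, whose attribute access can do interpolation/validation work on every call): read the value once, then use the cheap local in the hot loop.

```python
class DummyCfg:
    """Hypothetical stand-in for a config object with costly attribute access."""

    def __getattr__(self, name):
        # real config objects may resolve interpolations on every access
        return 0.01


cfg = DummyCfg()

lr = cfg.lr          # cache once at the top of the script
num_steps = 1000

total = 0.0
for _ in range(num_steps):
    total += lr      # cheap local lookup instead of cfg.lr each iteration
```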
comment in response to post
If you have a tiny, CPU-overhead-bound model (robotics, RL), avoid frequent calls to eval(), train(), model.parameters(), or anything else that traverses your module tree in eager mode. Prefer cached versions of these calls.
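For instance, a minimal sketch of caching the parameter list once instead of re-walking the module tree every step (toy model and sizes):

```python
import torch
from torch import nn

model = nn.Linear(4, 4)

# cache once, outside the training loop: model.parameters() walks the
# whole module tree on every call, which is pure CPU overhead for tiny models
params = list(model.parameters())

opt = torch.optim.SGD(params, lr=0.1)
for _ in range(3):
    x = torch.randn(2, 4)
    loss = model(x).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```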
comment in response to post
Avoid calling tensor.item() in between CUDA operations. This triggers a CUDA synchronization and blocks your code. Do the logging after all code (forward / backward / optim) has completed. See how to find sync points here.
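A sketch of the deferred-logging pattern (toy model, arbitrary sizes): keep the loss tensors during the loop and only call .item() once everything has been queued, so the sync happens a single time at the end.

```python
import torch
from torch import nn

model = nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

losses = []
for _ in range(3):
    x = torch.randn(4, 8)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.detach())  # keep the tensor: no .item(), no sync here

# one sync at the end, after all compute has been enqueued
print([l.item() for l in losses])
```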
comment in response to post
Avoid pinning memory in your code unless you thoroughly tested that it accelerates runtime (see this tutorial for more info). As an aside, pin_memory is also less safe! pytorch.org/tutorials/in...
comment in response to post
Don't send tensors to device using to(device) if you can instantiate them directly there. For instance, prefer randn((), device=device) to randn(()).to(device)
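In code, with a device fallback so the sketch also runs on CPU-only machines:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# preferred: allocate directly on the target device
a = torch.randn((), device=device)

# avoid: allocate on CPU first, then copy over
b = torch.randn(()).to(device)
```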
comment in response to post
I guess my point was that a proper name + definition is necessary to write good code. When I see "policy", "critic", "replay buffer", "env", I know exactly what does and doesn't belong to them. With "agent" it's systematically a "hm, yeah, why not" - then you end up with ill-defined monster classes
comment in response to post
If your agent is a policy call it policy, if it's a trainer call it trainer! If it's just a big undefined collection of methods, consider refactoring it...
comment in response to post
Every time I meet with people and someone talks about agents, there's at least one person who asks "what do you mean by agent?" or "you should not call that an agent".
comment in response to post
hard to tell, let's try :D
comment in response to post
Side note: we saw some nice adoption from DeepSeek-R1 reprod repos, which is humbling, if not thrilling! github.com/Jiayi-Pan/Ti...
comment in response to post
Easily transform video files into PyTorch tensors with: 🎯User-friendly APIs 🎯Exceptional CPU and CUDA performance 🎯Advanced sampling capabilities tailored for ML training pipelines
comment in response to post
Links: NeurIPS page: neurips.cc/virtual/2024... GitHub: github.com/facebookrese... Paper: arxiv.org/abs/2312.01472
comment in response to post
Where: West Ballroom A-D poster 6510, Wednesday Dec. 11th from 11 a.m. PST to 2 p.m. PST We’d love to see you there — please come and say hi!
comment in response to post
Built on TorchRL and PyTorch, BenchMARL ensures high performance and state-of-the-art implementations, while its flexible configuration and standardized reporting make it a breeze to use.
comment in response to post
BenchMARL is a cutting-edge training library designed to bring standardized benchmarking to the world of Multi-Agent Reinforcement Learning (MARL). It allows for easy comparison across different algorithms, models, and environments, making it a game-changer for researchers and developers alike.
comment in response to post
When I'm not presenting, you can find me hanging around the Meta booth. Ping me if you want to chat about BricksRL or anything else!
comment in response to post
We'll be presenting our poster on Wednesday at 4:30 p.m. — 7:30 p.m. PST in Hall A-C 4210. Come say hi!
comment in response to post
Want to learn more about BricksRL? Check out our paper: arxiv.org/abs/2406.17490 And don't forget to visit our GitHub page: github.com/BricksRL/bri...
comment in response to post
This project is only 1% complete! We have so many ideas for fun stuff to do that we simply can't make it alone. If you want to collaborate, please reach out! Let's build something amazing together
comment in response to post
But that's not all! We're also serious about prototyping ideas using low-level, cheap, entry-level hardware. Showing that research ideas scale to the real world is crucial, and we think BricksRL can help make that happen.
comment in response to post
We believe @PyTorch should have a stronger place in STEM education. With BricksRL, we aim to provide a fun and interactive way to learn about control and Reinforcement Learning, inspiring the next generation of researchers and engineers
comment in response to post
Why Lego? It offers low-cost hardware for learning & experimenting with control or Reinforcement Learning in the real world. Standardized, reproducible, and fun! Plus, the Lego community is active and vibrant, making it a great fit for our project.
comment in response to post
I'm teaming up with Sebastian Ditter & @gdefabritiis.bsky.social to present our paper on BricksRL, a library that enables control of Lego robots using #PyTorch
comment in response to post
Don’t hesitate to DM me if interested!