vmoens.bsky.social
Research Engineer - PyTorch core - Meta@London - Open-source/open science advocate - Maintainer of torchrl / tensordict / leanrl - Former MD - Neuroscience PhD https://github.com/vmoens
85 posts 1,917 followers 638 following
comment in response to post
MLGym makes it super easy to set up complex tasks to be solved by LLMs. Honestly one of the most intuitive APIs I have ever seen in that space!
comment in response to post
After that, your LLM reads these instructions and outputs commands along with some reasoning. The commands are executed in the Docker container's shell, and the result is returned to the agent.
comment in response to post
Good old cProfile with snakeviz is pretty cool too jiffyclub.github.io/snakeviz/ Again, not suited to CUDA ops, and not as fine-grained as line_profiler, but quite useful for macro-level tracking of compute time
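A minimal sketch of the workflow (function names here are arbitrary): profile in-process with cProfile, print a summary with pstats, and dump stats to a file that snakeviz can visualize.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # toy stand-in for the code you actually want to profile
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# print the top 5 entries by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())

# for snakeviz, dump to a file and run `snakeviz out.prof` from a terminal:
# profiler.dump_stats("out.prof")
```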
comment in response to post
torch.utils.benchmark.Timer is amazing for assessing the runtime of a whole isolated piece of code, but be mindful that the way it plays with global variables isn't always obvious and may differ from time.time() on occasion
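A small sketch of the point about globals: the `stmt` runs inside the namespace you pass via `globals`, not your script's namespace, so every name it uses must be handed over explicitly (the matmul here is just a placeholder workload).

```python
import torch
from torch.utils.benchmark import Timer

x = torch.randn(128, 128)

timer = Timer(
    stmt="x @ x",        # code under test, evaluated in `globals`
    globals={"x": x},    # pass every name the stmt needs explicitly
)
measurement = timer.timeit(100)  # run the stmt 100 times
print(measurement.median)        # median seconds per run
```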
comment in response to post
I use line_profiler to check the code line by line (careful: CUDA ops are async, do not trust it for these!) - very useful to check CPU overhead pypi.org/project/line...
comment in response to post
The profilers I use: PyTorch profiler to view the time spent in the various ops of my code. It can reliably show you what's going on for a single iteration of your function. pytorch.org/tutorials/re...
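A minimal example of collecting a CPU trace with torch.profiler (the workload and sizes here are arbitrary; add ProfilerActivity.CUDA when profiling GPU code):

```python
import torch
from torch.profiler import ProfilerActivity, profile

x = torch.randn(256, 256)

with profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(3):
        y = x @ x  # the op(s) you want to inspect

# per-op summary, sorted by total CPU time
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```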
comment in response to post
In general, in-place operations are not preferable to regular ones (you won't gain much memory or speed). Don't load your code with ReLU(inplace=True), mul_, or add_ unless absolutely necessary.
comment in response to post
Using hydra or similar fancy config objects: avoid calling cfg.attribute often in your code. Instead, cache the argument values in your script as global workspace variables.
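A toy illustration of the idea (DummyCfg below is a hypothetical stand-in for a hydra config, whose attribute access can do interpolation/validation work on every call): read the value once, then use the cheap local in the hot loop.

```python
class DummyCfg:
    """Hypothetical stand-in for a config object with costly attribute access."""

    def __getattr__(self, name):
        # real config objects may resolve interpolations on every access
        return 0.01


cfg = DummyCfg()

lr = cfg.lr          # cache once at the top of the script
num_steps = 1000

total = 0.0
for _ in range(num_steps):
    total += lr      # cheap local lookup instead of cfg.lr each iteration
```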
comment in response to post
If you have a tiny, CPU-overhead-bound model (robotics, RL), avoid frequent calls to eval(), train(), model.parameters(), or anything else that traverses your module tree in eager mode. Prefer cached versions of these calls.
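For instance, a minimal sketch of caching the parameter list once instead of re-walking the module tree every step (toy model and sizes):

```python
import torch
from torch import nn

model = nn.Linear(4, 4)

# cache once, outside the training loop: model.parameters() walks the
# whole module tree on every call, which is pure CPU overhead for tiny models
params = list(model.parameters())

opt = torch.optim.SGD(params, lr=0.1)
for _ in range(3):
    x = torch.randn(2, 4)
    loss = model(x).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```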
comment in response to post
Avoid calling tensor.item() in between CUDA operations. This triggers a CUDA synchronization and blocks your code. Do the logging after all code (forward / backward / optim) has completed. See how to find sync points here.
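A sketch of the deferred-logging pattern (toy model, arbitrary sizes): keep the loss tensors during the loop and only call .item() once everything has been queued, so the sync happens a single time at the end.

```python
import torch
from torch import nn

model = nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

losses = []
for _ in range(3):
    x = torch.randn(4, 8)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.detach())  # keep the tensor: no .item(), no sync here

# one sync at the end, after all compute has been enqueued
print([l.item() for l in losses])
```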
comment in response to post
Avoid pinning memory in your code unless you thoroughly tested that it accelerates runtime (see this tutorial for more info). As an aside, pin_memory is also less safe! pytorch.org/tutorials/in...
comment in response to post
Don't send tensors to device using to(device) if you can instantiate them directly there. For instance, prefer randn((), device=device) to randn(()).to(device)
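In code, with a device fallback so the sketch also runs on CPU-only machines:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# preferred: allocate directly on the target device
a = torch.randn((), device=device)

# avoid: allocate on CPU first, then copy over
b = torch.randn(()).to(device)
```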
comment in response to post
I guess my point was that a proper name + definition is necessary to write good code. When I see "policy", "critic", "replay buffer", "env", I know exactly what does and doesn't belong to them. With "agent" it's systematically a "hm, yeah, why not" - then you end up with ill-defined monster classes
comment in response to post
If your agent is a policy call it policy, if it's a trainer call it trainer! If it's just a big undefined collection of methods, consider refactoring it...
comment in response to post
Every time I meet with people and someone talks about agents, there's at least one person who asks "what do you mean by agent?" or "you should not call that an agent".
comment in response to post
hard to tell, let's try :D
comment in response to post
Side note: we saw some nice adoption from DeepSeek-R1 reprod repos, which is humbling, if not thrilling! github.com/Jiayi-Pan/Ti...
comment in response to post
Easily transform video files into PyTorch tensors with: 🎯User-friendly APIs 🎯Exceptional CPU and CUDA performance 🎯Advanced sampling capabilities tailored for ML training pipelines
comment in response to post
Links: NeurIPS page: neurips.cc/virtual/2024... GitHub: github.com/facebookrese... Paper: arxiv.org/abs/2312.01472
comment in response to post
Where: West Ballroom A-D poster 6510, Wednesday Dec. 11th from 11 a.m. PST to 2 p.m. PST We’d love to see you there — please come and say hi!
comment in response to post
Built on TorchRL and PyTorch, BenchMARL ensures high performance and state-of-the-art implementations, while its flexible configuration and standardized reporting make it a breeze to use.
comment in response to post
BenchMARL is a cutting-edge training library designed to bring standardized benchmarking to the world of Multi-Agent Reinforcement Learning (MARL). It allows for easy comparison across different algorithms, models, and environments, making it a game-changer for researchers and developers alike.
comment in response to post
When I'm not presenting, you can find me hanging around the Meta booth. Ping me if you want to chat about BricksRL or anything else!
comment in response to post
We'll be presenting our poster on Wednesday at 4:30 p.m. — 7:30 p.m. PST in Hall A-C 4210. Come say hi!
comment in response to post
Want to learn more about BricksRL? Check out our paper: arxiv.org/abs/2406.17490 And don't forget to visit our GitHub page: github.com/BricksRL/bri...
comment in response to post
This project is only 1% complete! We have so many ideas for fun stuff to do that we simply can't make it alone. If you want to collaborate, please reach out! Let's build something amazing together
comment in response to post
But that's not all! We're also serious about prototyping ideas using low-level, cheap, entry-level hardware. Showing that research ideas scale to the real world is crucial, and we think BricksRL can help make that happen.
comment in response to post
We believe @PyTorch should have a stronger place in STEM education. With BricksRL, we aim to provide a fun and interactive way to learn about control and Reinforcement Learning, inspiring the next generation of researchers and engineers
comment in response to post
Why Lego? It offers low-cost hardware for learning & experimenting with control or Reinforcement Learning in the real world. Standardized, reproducible, and fun! Plus, the Lego community is active and vibrant, making it a great fit for our project.
comment in response to post
I'm teaming up with Sebastian Ditter & @gdefabritiis.bsky.social to present our paper on BricksRL, a library that enables control of Lego robots using #PyTorch
comment in response to post
Don’t hesitate to DM me if interested!