The Bitter Lesson comes back all the time. For RL, it's about time we recognized how underutilized our hardware is. The JAX-based RL libraries opened the way, but there is much more work ahead on parallel RL algorithms.
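To make the hardware-utilization point concrete, here is a toy sketch of the batched-environment idea behind those JAX-based RL stacks. Plain Python stands in for JAX, and the environment and all names are made up for illustration:

```python
def step(state, action):
    """One toy environment transition: move toward a goal at 0."""
    next_state = state + action
    reward = -abs(next_state)
    return next_state, reward

def batched_step(states, actions):
    """Step many environment instances in lockstep.

    In a JAX-based stack this Python loop would be a single
    jax.vmap(step) call running on accelerator hardware, which is
    where the unused throughput comes from; here it is an ordinary
    list comprehension purely for illustration.
    """
    results = [step(s, a) for s, a in zip(states, actions)]
    next_states = [r[0] for r in results]
    rewards = [r[1] for r in results]
    return next_states, rewards

# Three "parallel" environment instances stepped at once.
states = [1.0, -2.0, 0.5]
actions = [-1.0, 2.0, -0.5]
next_states, rewards = batched_step(states, actions)
```

The design point is that the per-environment `step` stays scalar and the batching is applied outside it, which is what lets a vectorizing transform like `jax.vmap` scale the same code to thousands of instances.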
This isn’t a general solution to RL. The point is to make learning algorithms sample-efficient. If the environment you are doing RL in is the real world, you can’t make the “environment go fast”.
With “infinite samples”, you can randomly sample policies until you stumble on one with high reward.
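A minimal sketch of that brute-force idea, assuming a toy one-parameter environment (the environment, the hidden optimum, and all names are hypothetical):

```python
import random

# Toy "environment": reward is higher the closer the policy's single
# parameter is to a hidden optimum. Purely illustrative.
HIDDEN_OPTIMUM = 0.7

def evaluate(policy_param):
    """Return the reward obtained by a policy (here: one scalar)."""
    return -abs(policy_param - HIDDEN_OPTIMUM)

def random_policy_search(num_samples, seed=0):
    """Sample policies uniformly and keep the best one seen."""
    rng = random.Random(seed)
    best_param, best_reward = None, float("-inf")
    for _ in range(num_samples):
        param = rng.uniform(0.0, 1.0)
        reward = evaluate(param)
        if reward > best_reward:
            best_param, best_reward = param, reward
    return best_param, best_reward

best_param, best_reward = random_policy_search(num_samples=100_000)
```

With a sample budget this large the best parameter lands arbitrarily close to the optimum, which is exactly the thread's point: a fast simulator can buy you samples that the real world never will.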
I don't think it's a general solution to RL, but it is a way to get good policies for many of the problems I care about, in which I don't particularly care about real-world learning.
If one only cares about learning in simulators then they can simplify the problem. E.g., assume they have a perfect model, the environment state, and the ability to jump to arbitrary states.
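The three simplifying assumptions above can be sketched as a toy simulator interface (all names hypothetical; this is an illustration of the setting, not any particular simulator's API):

```python
class ToySim:
    """A toy simulator granting everything real-world RL lacks:
    a perfect model, full state observability, and arbitrary resets."""

    def __init__(self, state=0):
        self.state = state  # the full environment state, directly visible

    def get_state(self):
        return self.state

    def set_state(self, state):
        # Jump to an arbitrary state -- impossible outside a simulator.
        self.state = state

    def step(self, action):
        # Perfect model: the transition function is known exactly.
        self.state += action
        reward = -abs(self.state)
        return self.state, reward

# Jump straight to a promising state instead of re-experiencing
# the whole episode from the start.
sim = ToySim()
sim.set_state(5)
state, reward = sim.step(-5)
```

Each method corresponds to one assumption in the post above, and removing any of them (as the real world does) makes the problem qualitatively harder.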
This simpler setting is solved from a research perspective imo which is why engineering is the bottleneck.
Hmm, can you clarify what you mean by "solved from a research perspective"? I would say that even in that domain we don't always know how to efficiently construct a policy
You’re correct, there are plenty of simulated environments we can’t solve yet. But do you consider it a desirable solution to spin up 1 million parallel instances of an environment, sped up 100x, and solve it with PPO in low wall-clock time?
The distinction between real world and simulation is not quite the right one, though. The right abstraction is big worlds vs. small worlds [1]. We don't have algorithms that can learn in big worlds.
Okay I'm glad I asked because this makes clear the disagreement. A few things:
1) I don't consider either of those solved. OpenAI Five did not appropriately restrict the click rate, it's questionable whether AlphaStar reached superhuman level
2) Even given that, there are many problems that fall
1/2
I have tried and failed to make a similar argument for years. Every NSF proposal I submit about SoftEng for robotics gets destroyed because it's understood as mere development. It's much more than that: making the right tools yields thinking frameworks that become productivity and creativity amplifiers.
Some people don’t see my efforts to create better simulations during my PhD as "real science." I get that it’s not technically science in itself, but you need better simulations to do better science!
I've been arguing pretty much the same thing in quantum for over two decades now. Too many people in #quantumcomputing know so little about systems engineering and architecture that they don't know what they don't know.
Haven't people been doing that for a while, e.g. self-driving? In the limit of infinite storage, you could just have an infinite lookup table. Still, quite a few representation-learning advances were needed to get where we are today.
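The infinite-lookup-table limit can be sketched in a few lines (all names hypothetical):

```python
# With unbounded storage, a policy could in principle just memorize the
# best known action for every state ever encountered.
policy_table = {}

def record(state, best_action):
    """Store the best action found for a state."""
    policy_table[state] = best_action

def act(state, default_action="noop"):
    """Look the state up; fall back to a default for unseen states.
    That fallback is exactly where representation learning earns its
    keep: generalizing to new states instead of memorizing old ones."""
    return policy_table.get(state, default_action)

record("red_light", "brake")
record("green_light", "accelerate")
```

In any big world the table never covers more than a vanishing fraction of states, which is why the representation-learning advances mentioned above were needed.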
We can make the recipe more efficient but there is no research bottleneck imo.
Real-world learning requires new ideas. Existing algorithms completely fail.
[1] The Big World Hypothesis and its Ramifications
https://openreview.net/pdf?id=Sv7DazuCn8
https://www.microsoft.com/en-us/research/uploads/prod/2020/11/Leiserson-et-al-Theres-plenty-of-room-at-the-top.pdf