karltuyls.bsky.social • 95 days ago

Fascinating read on streaming RL in deep learning - https://arxiv.org/pdf/2410.14606

Comments

sacha2.bsky.social•95 days ago

Saw Mr Sutton's tweet about the matter -- the paper looks very fascinating to read

sacha2.bsky.social•95 days ago

If anyone is interested: https://x.com/RichardSSutton/status/1860818651953463542

florentdelgrange.bsky.social•95 days ago

📌

antoine-mln.bsky.social•95 days ago

I’ve never seen this terminology before. Is it not just online RL?

tomdupuis.bsky.social•95 days ago

It's "true" online RL, ie batch size 1

antoine-mln.bsky.social•95 days ago

yeah that’s just online RL for infinite horizon. the batch updates often considered (especially if model-based) are an algorithmic choice which pretty much goes back to UCRL (iirc), not a problem definition

if you look at model-free you can find such updates, eg https://arxiv.org/abs/1910.07072

antoine-mln.bsky.social•95 days ago

I understand the emphasis for deep RL but since we have words to describe this we might as well use them idk

tomdupuis.bsky.social•95 days ago

I agree that from the POV of theory it's just a practical choice, but practical choices matter much more than theory would have you believe... Actually, most of the profound advances in practical results in deep RL come from understanding which practical details are important for SGD to start working properly in RL

tomdupuis.bsky.social•95 days ago

So I don't think we should just say "Yeah that's just online RL". Using streaming RL in the name emphasizes the departure from a batched data practice which is widely used currently, which is the whole point of the paper
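
To make the "batch size 1" distinction concrete, here is a minimal sketch of a streaming update. It is not taken from the paper: the random-walk environment, the function name `td0_streaming`, and all hyperparameters are assumptions chosen for illustration, and it uses tabular TD(0) rather than a deep network. The point is only that each transition is used once for an update and then discarded, with no replay buffer and no minibatch.

```python
import random


def td0_streaming(num_states=5, episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Tabular TD(0) on a random walk, updating from one transition at a time.

    Hypothetical toy setup: a chain of `num_states` states, reward 1 for
    exiting on the right, reward 0 for exiting on the left.
    """
    rng = random.Random(seed)
    values = [0.0] * num_states  # one value estimate per non-terminal state
    for _ in range(episodes):
        state = num_states // 2  # start in the middle of the chain
        while True:
            next_state = state + rng.choice((-1, 1))
            done = next_state < 0 or next_state >= num_states
            reward = 1.0 if next_state >= num_states else 0.0
            bootstrap = 0.0 if done else values[next_state]
            # "Batch size 1": update immediately from this single transition...
            values[state] += alpha * (reward + gamma * bootstrap - values[state])
            # ...then drop it. Nothing is stored for later replay.
            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td0_streaming()])
```

The batched practice this contrasts with would instead append each transition to a buffer and update from sampled minibatches, reusing old data many times.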

weiyen.net•95 days ago

📌
