The irony of deep reinforcement learning is that an ostensibly real-time, online form of learning often depends on experience replay: random sampling from an offline buffer of perfectly stored memories.
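
For concreteness, here is a minimal sketch of the kind of buffer I have in mind, assuming uniform sampling over a fixed-capacity store. The class name `ReplayBuffer`, its methods, and the transition layout `(state, action, reward, next_state, done)` are my own illustrative choices, not any particular library's API.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity store of past transitions."""

    def __init__(self, capacity, seed=None):
        # deque with maxlen silently evicts the oldest transition when full
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        # Each memory is stored verbatim: no decay, no distortion
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, ignoring temporal order
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)
```

The agent acts online and pushes each transition as it happens, but the learning updates are computed on batches drawn from `sample`, which breaks the temporal correlation between consecutive experiences.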

Is there reason to think something like this happens in biological systems? Perhaps this is what the projection from the hippocampus to the nucleus accumbens helps with?
