Today’s RL algorithms are usually not great at long-term planning in complex environments, mainly because long-term planning in complex environments is a hard problem, e.g. because of the combinatorial explosion of possibilities. (So much the worse for today’s RL algorithms!) But I don’t think that relates to humans. 1/3
