Our study of 35k RNNs reveals that which short-term memory mechanism a network settles on (i.e., a slow-point manifold vs. a limit cycle; see the sketch below the list) depends on:
🤖 the task structure and
🤖 the learning rate of the RNN.

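For readers less familiar with the two mechanisms, here is a minimal NumPy sketch (not our code; the weights are random stand-ins) of how one can tell them apart in a trained RNN: slow points are located by minimizing the speed q(h) = ½‖F(h) − h‖², in the spirit of Sussillo & Barak (2013), while a limit cycle shows up as an autonomous trajectory whose speed never drops to zero even though it keeps revisiting the same loop.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32                                                  # hidden size (assumed)
W = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))     # stand-in recurrent weights
b = np.zeros(N)

def step(h):
    """One autonomous step of the RNN (no input): F(h) = tanh(W h + b)."""
    return np.tanh(W @ h + b)

def speed(h):
    """Kinetic-energy-style objective q(h) = 0.5 * ||F(h) - h||^2."""
    d = step(h) - h
    return 0.5 * d @ d

def grad_speed(h):
    """Analytic gradient of q: (J_F(h) - I)^T (F(h) - h)."""
    f = np.tanh(W @ h + b)
    J = (1.0 - f**2)[:, None] * W                       # Jacobian of F at h
    return (J - np.eye(N)).T @ (f - h)

def find_slow_point(h0, lr=0.2, iters=2000):
    """Plain gradient descent on q(h); q ~ 0 indicates a fixed/slow point."""
    h = h0.copy()
    for _ in range(iters):
        h -= lr * grad_speed(h)
    return h, speed(h)

# Slow-point mechanism: memory lives near states where q(h*) is ~0.
h_star, q_star = find_slow_point(rng.normal(size=N) * 0.1)
print(f"q(h*) = {q_star:.2e}")

# Limit-cycle mechanism: the autonomous trajectory keeps moving (q stays bounded
# away from 0) while cycling through the same region of state space.
h = rng.normal(size=N) * 0.1
for _ in range(500):
    h = step(h)
print(f"late-trajectory speed = {speed(h):.2e}")
```
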
We further derive scaling laws for how long RNNs can store information before the memory fails.
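
The intuition behind a retention-time estimate is easiest to see in the linearized regime: assuming a stored bit decays along the slowest mode of the Jacobian at a slow point, it stays readable for roughly T ≈ log(ε/a₀) / log|λ_max| steps. The snippet below is an illustration with random weights and a hypothetical readout threshold ε, not our derivation; it just compares that estimate to a direct simulation.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 32
W = rng.normal(size=(N, N)) / np.sqrt(N)                # stand-in recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))         # pin the spectral radius at 0.9

# Around the fixed point h* = 0, the Jacobian of h -> tanh(W h) is just W, so a
# stored perturbation decays like |lambda_max|^t along the slowest mode.
lam = np.max(np.abs(np.linalg.eigvals(W)))
a0, eps = 1.0, 1e-2                                     # initial amplitude, readout threshold
T_est = np.log(eps / a0) / np.log(lam)
print(f"|lambda_max| = {lam:.3f}, predicted retention ~ {T_est:.1f} steps")

# Empirical check: nudge the state along the leading eigendirection and count the
# steps until the memory trace drops below the readout threshold.
vals, vecs = np.linalg.eig(W)
v = np.real(vecs[:, np.argmax(np.abs(vals))])
v /= np.linalg.norm(v)
h, t = a0 * v, 0
while np.linalg.norm(h) > eps and t < 10_000:
    h = np.tanh(W @ h)
    t += 1
print(f"empirical retention ~ {t} steps")
```

How this retention time actually scales with the network and the task is what the scaling laws in the paper pin down.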
