How well can neural networks generalize from how little data?
New work with Emmanuel Chemla and Roni Katzir:
Benchmark:
https://github.com/taucompling/bliss
Paper:
https://aclanthology.org/2023.clasp-1.15/
🧵
Humans generalize remarkably well from very little data. What about neural nets?
The benchmark assigns a generalization index (GI) to a model based on how much it generalizes from how little training data.
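To make the idea concrete, here is a minimal Python sketch of the two quantities such an index trades off: accuracy on strings well beyond the training range, and how little data the model was trained on. The `model_accepts` interface, the acceptance-style framing, and the report format are illustrative assumptions, not the benchmark's actual definition of the index.

```python
from typing import Callable, Iterable, Tuple

def generalization_report(
    model_accepts: Callable[[str], bool],  # hypothetical model interface (assumption)
    in_language: Callable[[str], bool],    # gold membership test for the formal language
    held_out: Iterable[str],               # strings well beyond the training range
    n_train: int,                          # number of training strings the model saw
) -> Tuple[float, int]:
    """Return (accuracy on held-out strings, training-set size).

    A model the benchmark rewards keeps the first number high while
    keeping the second small; the real index combines these quantities
    differently than this toy report does.
    """
    held_out = list(held_out)
    correct = sum(model_accepts(s) == in_language(s) for s in held_out)
    return correct / len(held_out), n_train
```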
The initial release includes languages such as aⁿbⁿ, aⁿbᵐcⁿ⁺ᵐ, and Dyck-1 and Dyck-2.
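For readers unfamiliar with these formal languages, here is a small self-contained sketch of membership checkers. These helpers are only illustrative (the benchmark's own task definitions, e.g. its conventions for n = 0, may differ), but they could plug into the sketch above as the `in_language` argument.

```python
def is_anbn(s: str) -> bool:
    """a^n b^n: n 'a's followed by exactly n 'b's, e.g. 'aaabbb' (n >= 1 assumed here)."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n

def is_anbmcnm(s: str) -> bool:
    """a^n b^m c^(n+m): e.g. 'aabccc' (n=2, m=1; n + m >= 1 assumed here)."""
    a = len(s) - len(s.lstrip("a"))          # count of leading 'a's
    rest = s[a:]
    b = len(rest) - len(rest.lstrip("b"))    # count of 'b's after the 'a' block
    c = len(rest[b:])                        # everything else must be 'c's
    return s == "a" * a + "b" * b + "c" * c and c == a + b and a + b > 0

def is_dyck1(s: str) -> bool:
    """Dyck-1: balanced strings over one bracket pair, e.g. '(())()'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        else:
            return False
        if depth < 0:                        # a closer with no matching opener
            return False
    return depth == 0

def is_dyck2(s: str) -> bool:
    """Dyck-2: balanced strings over two bracket pairs, e.g. '([])()'."""
    pairs = {")": "(", "]": "["}
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)
        elif ch in ")]":
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            return False
    return not stack
```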
A long line of prior work has tested generalization in neural nets in different ways.
Many of these studies showed nets generalizing to some extent beyond their training data, but usually did not explain why generalization stopped at seemingly arbitrary points: why would a net get a¹⁰¹⁷b¹⁰¹⁷ right but a¹⁰¹⁸b¹⁰¹⁸ wrong?
The second-best net, a Memory-Augmented RNN by Suzgun et al., shows that expressive power is important for GI, but isn't enough on its own with little data.