How well can neural networks generalize from how little data?
New work with Emmanuel Chemla and Roni Katzir:
Benchmark:
https://github.com/taucompling/bliss
Paper:
https://aclanthology.org/2023.clasp-1.15/
🧵
Humans generalize remarkably well from very little data. What about neural nets?
The benchmark assigns a generalization index (GI) to a model based on how much it generalizes from how little training data.
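To make the idea concrete, here is a minimal Python sketch of the two quantities such an index trades off: accuracy on strings well beyond the training range, and how little data the model was trained on. The `model_accepts` interface, the acceptance-style framing, and the report format are illustrative assumptions, not the benchmark's actual definition of the index.

```python
from typing import Callable, Iterable, Tuple

def generalization_report(
    model_accepts: Callable[[str], bool],  # hypothetical model interface (assumption)
    in_language: Callable[[str], bool],    # gold membership test for the formal language
    held_out: Iterable[str],               # strings well beyond the training range
    n_train: int,                          # number of training strings the model saw
) -> Tuple[float, int]:
    """Return (accuracy on held-out strings, training-set size).

    A model the benchmark rewards keeps the first number high while
    keeping the second small; the real index combines these quantities
    differently than this toy report does.
    """
    held_out = list(held_out)
    correct = sum(model_accepts(s) == in_language(s) for s in held_out)
    return correct / len(held_out), n_train
```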
The initial release includes languages such as aⁿbⁿ, aⁿbᵐcⁿ⁺ᵐ, and Dyck-1 and Dyck-2.
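For readers unfamiliar with these formal languages, here is a small self-contained sketch of membership checkers. These helpers are only illustrative (the benchmark's own task definitions, e.g. its conventions for n = 0, may differ), but they could plug into the sketch above as the `in_language` argument.

```python
def is_anbn(s: str) -> bool:
    """a^n b^n: n 'a's followed by exactly n 'b's, e.g. 'aaabbb' (n >= 1 assumed here)."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n

def is_anbmcnm(s: str) -> bool:
    """a^n b^m c^(n+m): e.g. 'aabccc' (n=2, m=1; n + m >= 1 assumed here)."""
    a = len(s) - len(s.lstrip("a"))          # count of leading 'a's
    rest = s[a:]
    b = len(rest) - len(rest.lstrip("b"))    # count of 'b's after the 'a' block
    c = len(rest[b:])                        # everything else must be 'c's
    return s == "a" * a + "b" * b + "c" * c and c == a + b and a + b > 0

def is_dyck1(s: str) -> bool:
    """Dyck-1: balanced strings over one bracket pair, e.g. '(())()'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        else:
            return False
        if depth < 0:                        # a closer with no matching opener
            return False
    return depth == 0

def is_dyck2(s: str) -> bool:
    """Dyck-2: balanced strings over two bracket pairs, e.g. '([])()'."""
    pairs = {")": "(", "]": "["}
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)
        elif ch in ")]":
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            return False
    return not stack
```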
A long line of prior work has tested generalization in neural nets in different ways.
Many of these studies showed nets generalizing to some extent beyond their training data, but usually did not explain why generalization stopped at seemingly arbitrary points: why would a net get a¹⁰¹⁷b¹⁰¹⁷ right but a¹⁰¹⁸b¹⁰¹⁸ wrong?
The second-best net, a Memory-Augmented RNN by Suzgun et al., shows that expressive power is important for GI, but isn't enough on its own with little data.