Thanks Kevin!
And thanks to my collaborators @alexkoulakov.bsky.social,
Sergey Shuvaev, and Divyansha Lachi!
(S & D -- have you joined the party here yet?)
Reposted from
Kevin Mitchell
A wonderful paper from @tonyzador.bsky.social and colleagues: Encoding innate ability through a genomic bottleneck www.pnas.org/doi/10.1073/...
Comments
See e.g. https://aclanthology.org/2024.acl-long.713/ and https://doi.org/10.1162/tacl_a_00489
and very similar work by Gaier & Ha 2019 http://arxiv.org/abs/1906.04358
Interesting question... I suspect not, because I think we are finding very different solutions, whereas reducing precision finds slightly perturbed solutions (I think)
Re: precision, I meant that there's a chance the "compressed" network is in practice not a bottleneck if it can store the same (or approximately the same) solution as the larger net in high-precision weights.
Would be interesting to see how it does when quantized (at either test or train time)
Also, I think quantization finds a solution very near W* (the original trained network), whereas our compression goes far from W*
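To make the "very near W*" point concrete, here is a minimal sketch, assuming plain uniform (round-to-nearest) quantization over the weights' own range. The function names and the random stand-in for W* are hypothetical and not taken from the paper. With this scheme every quantized weight sits within half a quantization step of its original value, so the result is only a small perturbation of W*.

```python
import random

def quantize(weights, bits):
    """Uniformly quantize weights to 2**bits levels spanning their own range.

    Returns the quantized weights and the step size between levels.
    Illustrative only; real quantization schemes differ in many details.
    """
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (2 ** bits - 1)
    if step == 0.0:  # all weights identical, nothing to quantize
        return list(weights), 0.0
    return [lo + round((w - lo) / step) * step for w in weights], step

def max_abs_error(a, b):
    """Largest per-weight distance between two weight vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical stand-in for a trained weight vector W*.
rng = random.Random(0)
w_star = [rng.gauss(0.0, 1.0) for _ in range(1000)]

w_q8, step8 = quantize(w_star, 8)
w_q4, step4 = quantize(w_star, 4)

print("8-bit max error:", max_abs_error(w_star, w_q8), "half step:", step8 / 2)
print("4-bit max error:", max_abs_error(w_star, w_q4), "half step:", step4 / 2)
```

The sketch only illustrates why quantization stays close to the original solution; it says nothing about how far the genomic-bottleneck compression moves from W*.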
Apologies for not having been aware of it (or citing it)... our first draft of this paper was posted in 2020, and it took us 4 years to finally publish
(we do cite Gaier & Ha, 2019, as ref #38).
In your framework, does random connectivity have long/maximal genomic description length (a la Kolmogorov)? My guess would've been that one of the important 'rules' in wiring a nervous system, or pretrained ANN, is to wire (somewhat) randomly (e.g., in the olfactory system).
https://www.cell.com/cell/fulltext/S0092-8674(22)01257-0
We explore this in a model with @tyrellturing.bsky.social et al
https://www.biorxiv.org/content/10.1101/2024.08.07.606541v1
Does it then make more sense to have the g-network output the parameters of a probability distribution from which the connectome is sampled? In the current formulation, it seems like it would be forced to memorize a particular realization of random noise, giving overly pessimistic compression lengths?
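A back-of-the-envelope sketch of the concern above, assuming each possible connection is an independent coin flip with probability p. It uses the Shannon entropy bound as a rough stand-in for Kolmogorov-style description length. The function names and the 32-bit parameter budget are assumptions for illustration, not anything from the paper.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a single Bernoulli(p) connection."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bits_for_realization(n_pairs, p):
    """Approximate bits needed to store one specific random wiring diagram."""
    return n_pairs * binary_entropy(p)

def bits_for_distribution(precision_bits=32):
    """Bits needed to store only the rule, i.e. the connection probability p."""
    return precision_bits

n_pairs = 1_000_000  # hypothetical number of candidate connections
for p in (0.5, 0.1):
    print(p, bits_for_realization(n_pairs, p), bits_for_distribution())
```

Under these assumptions, memorizing one realization costs on the order of one bit per candidate connection, while specifying the distribution costs a fixed handful of bits regardless of network size.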
The cell type version has stochasticity
Working on a version that actually follows a developmental program
Speaking of genomic bottlenecks, Beamrider is the first computer game I ever played, on my father’s MSX computer