nicolaskeriven.bsky.social
CNRS researcher. Hates genAI with a passion. Food and music 🎹 https://linktr.ee/bluecurlmusic
Colon, question mark, ticking all the boxes
Also note that we limit ourselves to vanilla GNNs: no skip connections, normalization, or other classical anti-oversmoothing strategies. That's for the future!
7/7
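For concreteness, a "vanilla" GNN in this sense would look roughly like the sketch below: propagate-and-transform layers only, no residual/skip connections and no normalization. This is a hypothetical minimal example (the class name, dimensions, and the choice of PyTorch are mine, not the paper's).

```python
import torch
import torch.nn as nn

class VanillaGNN(nn.Module):
    """Plain message-passing GNN: x -> relu(A @ (x W)), repeated.
    No skip connections, no normalization layers."""

    def __init__(self, in_dim, hidden_dim, out_dim, num_layers):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
        self.layers = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )

    def forward(self, x, adj):
        # x: (n, in_dim) node features; adj: (n, n) normalized graph matrix
        for i, lin in enumerate(self.layers):
            x = adj @ lin(x)              # transform, then propagate over the graph
            if i < len(self.layers) - 1:
                x = torch.relu(x)         # non-linearity on hidden layers only
        return x
```

With many layers and none of these corrective tricks, the repeated multiplication by `adj` is exactly what drives the oversmoothing of both the forward and the backward signals discussed in this thread.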
Note that, while many optimization results are adapted from MLPs to GNNs, this one is *specific* to GNNs: we show that this does not happen for MLPs.
6/7
...and that's bad! As soon as the last layer is trained (and that happens immediately!), this average vanishes. The entire GNN quickly gets stuck in a terrible near-critical point!
(*at every layer*, due to further mechanisms).
That's the main result of this paper.
5/7
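A back-of-the-envelope way to see why the average vanishes so fast, in my own notation and under my own assumptions (a squared loss and a trainable output bias $b$; the paper's exact setting may differ): with predictions $\hat y_i$ and targets $y_i$ over $n$ nodes,

$$\frac{\partial}{\partial b}\,\frac{1}{2n}\sum_{i=1}^{n}\|\hat y_i - y_i\|^2 \;=\; \frac{1}{n}\sum_{i=1}^{n}\big(\hat y_i - y_i\big),$$

so as soon as the output layer (here, just its bias) reaches a stationary point, the mean prediction error is exactly zero. Since the oversmoothed backward signal carries nothing but this mean down to the middle layers, their gradients collapse with it.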
...the "oversmoothing" of the backward signal is very particular: *when the forward is oversmoothed*, the updates to the backward are almost "linear-GNN" with no non-linearity.
So we *can* compute its oversmoothed limit! It is the **average** of the prediction error...
4/7
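In a caricatured version (my notation, not the paper's exact statement): write $\delta^{(l)}$ for the backward signal at layer $l$, starting from the prediction error at the output. Each backward step multiplies by the graph matrix and by the ReLU mask; when the forward is oversmoothed, that mask is (nearly) the same for every node, so the step is effectively linear. Taking the fully-averaging operator $\frac{1}{n}\mathbf{1}\mathbf{1}^{\top}$ as the extreme case of the graph matrix,

$$\delta^{(l)} \;\approx\; \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}\,\delta^{(l+1)}\,M^{(l)} \;=\; \mathbf{1}\,\overline{\delta^{(l+1)}}\,M^{(l)},$$

i.e. every node receives the same thing: the average of the prediction error, times fixed matrices $M^{(l)}$ collecting the weights and masks.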
... but at the *middle* layers! Due to the non-linearity, it oversmooths *when the forward is also oversmoothed* 😲
Why is it bad for training? Well...
3/7
The gradients are made of two parts: the forward signal, and the backward "signal", which is initialized to the prediction error at the output and then backpropagated.
Like the forward, the backward signal is multiplied by the graph matrix at every layer, so it will oversmooth...
2/7
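In formulas, with my notation and assuming a standard GCN-style update $H^{(l+1)} = \sigma\big(A H^{(l)} W^{(l)}\big)$ with graph matrix $A$ (the paper's conventions may differ): writing $\delta^{(l)} = \partial\mathcal{L}/\partial H^{(l)}$ for the backward signal,

$$\frac{\partial\mathcal{L}}{\partial W^{(l)}} = \big(A H^{(l)}\big)^{\top}\Big(\delta^{(l+1)} \odot \sigma'\big(A H^{(l)} W^{(l)}\big)\Big),
\qquad
\delta^{(l)} = A^{\top}\Big(\delta^{(l+1)} \odot \sigma'\big(A H^{(l)} W^{(l)}\big)\Big)\,W^{(l)\top},$$

with $\delta^{(L)} = \hat Y - Y$ at the output for, say, a squared loss. The weight gradient couples a forward part ($A H^{(l)}$) with a backward part ($\delta$), and the backward recursion picks up a factor of the graph matrix at every layer, just like the forward pass.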
bsky.app/profile/nico...
Super simple: I open a computer-science book on hardware, build a computer, and go to Wikipedia
Setting up eduroam on the hub of my connected cat flap