I can finally share our draft on (a) the first method for proving lower bounds against one-layer Transformers with *infinite* precision and (b) a new scalable attention mechanism for tracking higher-order interactions between tokens. More info in the 🧵 #MLsky #ML #AI #CompSky 🧪
