yigit.ai
Research Scientist at Google. PhD from ETH Zürich. Exotic AI architectures and silicon. 👾 Zürich, Switzerland
42 posts 419 followers 550 following

Making LLMs run efficiently can feel scary, but scaling isn’t magic, it’s math! We wanted to demystify the “systems view” of LLMs and wrote a little textbook called “How To Scale Your Model” which we’re releasing today. 1/n
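Not from the book, just my own back-of-envelope illustration of the kind of math involved, assuming the common ~6 · params · tokens FLOPs rule of thumb for dense transformer training (all numbers made up):

    # Rough training-compute estimate, assuming ~6 * params * tokens FLOPs
    # for a dense transformer (illustrative numbers only).
    params = 70e9                    # 70B-parameter model
    tokens = 2e12                    # 2T training tokens
    flops = 6 * params * tokens      # ~8.4e23 FLOPs

    chip_flops_per_s = 1e15 * 0.4    # 1 PFLOP/s peak at ~40% utilization
    num_chips = 4096
    seconds = flops / (chip_flops_per_s * num_chips)
    print(f"~{seconds / 86400:.1f} days on {num_chips} chips")  # ~5.9 days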

Linux running in a PDF file: linux.doompdf.dev/linux.pdf

If you’re passionate about brain-inspired algorithms/hardware and novel neural computation beyond the current TPU/GPU stack, please apply to the CapoCaccia Workshops for Neuromorphic Intelligence. It values creativity, exploration, and interdisciplinary collaboration 🧪 capocaccia.cc/en/public/at...

true but measuring compute is more fun.

Live ISS telemetry is interesting to watch. You can monitor critical sensors, e.g., airlock or cabin pressures, or the urine tank percentage :)

I didn't properly practice winter sports during my 5-year PhD in Switzerland. This morning, I'm on the SBB train to LAAX to learn snowboarding in 5 days.

1/ Okay, one thing that has been revealed to me from the replies to this is that many people don't know (or refuse to recognize) the following fact: the units in ANNs are actually not a terrible approximation of how real neurons work! A tiny 🧵. 🧠📈 #NeuroAI #MLSky

The Nobel lecture in economic sciences from Daron Acemoglu is about to start

Today is Gemini's 1st birthday 🎂, and the new experimental model, gemini-exp-1206 is #1 across the board in LMSYS Chatbot Arena.

if you see this, post a knight

ASML in Europe builds one of the most complex and precious engineering artifacts, the EUV lithography machine, and sits at the root of the modern tech tree.

Why does #compneuro need new learning methods? ANN models are usually trained with Gradient Descent (GD), which violates biological realities like Dale’s law and log-normal weight distributions. Here we describe a superior learning algorithm for comp neuro: Exponentiated Gradients (EG)! 1/12 #neuroscience 🧪
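Not the thread's actual method, just a minimal JAX sketch of what a multiplicative (exponentiated-gradient) update looks like next to plain GD, assuming the classic EG rule where each weight is rescaled by an exponential of its gradient, so its sign, and hence Dale's law, is preserved:

    import jax
    import jax.numpy as jnp

    def loss(w, x, y):
        return jnp.mean((x @ w - y) ** 2)

    def gd_step(w, x, y, lr=1e-2):
        # additive update: weight signs can flip freely
        g = jax.grad(loss)(w, x, y)
        return w - lr * g

    def eg_step(w, x, y, lr=1e-2):
        # multiplicative update: w_i <- w_i * exp(-lr * sign(w_i) * g_i)
        # weights are rescaled, never cross zero, so excitatory/inhibitory
        # identity (Dale's law) is preserved and magnitudes stay skewed
        g = jax.grad(loss)(w, x, y)
        return w * jnp.exp(-lr * jnp.sign(w) * g)

    # toy usage on a linear regression problem with positive weights
    x = jax.random.normal(jax.random.PRNGKey(0), (128, 16))
    w_true = jnp.abs(jax.random.normal(jax.random.PRNGKey(1), (16,)))
    y = x @ w_true
    w = jnp.abs(jax.random.normal(jax.random.PRNGKey(2), (16,)))
    for _ in range(200):
        w = eg_step(w, x, y)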

My posts are to be consumed by neural nets anyway. Good job @hf.co .

My morning routine now includes practicing latte art with my flat white at the Google MKs.

A good loss landscape: Switzerland 🇨🇭

After a PhD with Python, returning to C++ for a good reason :)

Google Research Zürich is a magical place quite like Hogwarts. Every wizard I meet works on powerful spells and potions.

New toy has arrived!

Quite a candy for my optimization appetite :) Since the majority of inference still runs on CPUs and mobile/edge accelerators, Mojo will be interesting to watch closely. I wonder how well it will support Triton or CUDA https://youtu.be/6GvB5lZJqcE

@vedatmilor.bsky.social Welcome, Vedat Bey :)

I'm releasing a minimal (<150 lines) JAX implementation of the "Gradients without Backpropagation" paper. It proposes a simple addition to forward AD to estimate unbiased gradients during a single inference pass (quick project, might be further optimized) https://github.com/YigitDemirag/forward-gradients
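The repo has the full version; here is just a bare-bones sketch of the core idea (my own condensation, names are mine), using jax.jvp for the single forward pass:

    import jax
    import jax.numpy as jnp

    def loss(w):
        return jnp.sum(jnp.sin(w) ** 2)

    def forward_gradient(f, w, key):
        # sample a random tangent direction v ~ N(0, I)
        v = jax.random.normal(key, w.shape)
        # one forward-mode pass gives f(w) and the directional derivative grad(f)·v
        y, dir_deriv = jax.jvp(f, (w,), (v,))
        # (grad(f)·v) * v is an unbiased estimate of grad(f), since E[v v^T] = I
        return y, dir_deriv * v

    w = jnp.ones(4)
    y, g_hat = forward_gradient(loss, w, jax.random.PRNGKey(0))
    # g_hat can be fed to any optimizer in place of the backprop gradient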

Auto-GPT (74.6k) has more GitHub stars than PyTorch (65.4k) 🧐🧐

👻 boo!