yigit.ai
Research Scientist at Google. PhD from ETH Zürich. Exotic AI architectures and silicon. 👾 Zürich, Switzerland
42 posts 419 followers 550 following

Making LLMs run efficiently can feel scary, but scaling isn’t magic, it’s math! We wanted to demystify the “systems view” of LLMs and wrote a little textbook called “How To Scale Your Model” which we’re releasing today. 1/n
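Not from the book, just my own back-of-envelope illustration of the kind of math involved, assuming the common ~6 · params · tokens FLOPs rule of thumb for dense transformer training (all numbers made up):

    # Rough training-compute estimate, assuming ~6 * params * tokens FLOPs
    # for a dense transformer (illustrative numbers only).
    params = 70e9                    # 70B-parameter model
    tokens = 2e12                    # 2T training tokens
    flops = 6 * params * tokens      # ~8.4e23 FLOPs

    chip_flops_per_s = 1e15 * 0.4    # 1 PFLOP/s peak at ~40% utilization
    num_chips = 4096
    seconds = flops / (chip_flops_per_s * num_chips)
    print(f"~{seconds / 86400:.1f} days on {num_chips} chips")  # ~5.9 days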

Linux running in a PDF file: linux.doompdf.dev/linux.pdf

If you’re passionate about brain-inspired algorithms/hardware and novel neural computation beyond the current TPU/GPU stack, please apply to the CapoCaccia Workshops for Neuromorphic Intelligence. It values creativity, exploration, and interdisciplinary collaboration 🧪 capocaccia.cc/en/public/at...

true but measuring compute is more fun.

Live ISS telemetry is interesting to watch. You can monitor critical sensors, e.g., airlock or cabin pressures, or the urine tank percentage :)

I didn't properly practice winter sports during my 5-year PhD in Switzerland. This morning, I'm on the SBB train to LAAX to learn snowboarding in 5 days.

1/ Okay, one thing that has been revealed to me from the replies to this is that many people don't know (or refuse to recognize) the following fact: the units in ANNs are actually not a terrible approximation of how real neurons work! A tiny 🧵. 🧠📈 #NeuroAI #MLSky

The Nobel lecture in economic sciences from Daron Acemoglu is about to start

Today is Gemini's 1st birthday 🎂, and the new experimental model, gemini-exp-1206 is #1 across the board in LMSYS Chatbot Arena.

if you see this, post a knight

ASML in Europe builds one of the most complex and precious engineering artifacts, the EUV lithography machine, and sits at the root of the modern tech tree.

Why does #compneuro need new learning methods? ANN models are usually trained with Gradient Descent (GD), which violates biological realities like Dale’s law and log-normal weight distributions. Here we describe a superior learning algorithm for comp neuro: Exponentiated Gradients (EG)! 1/12 #neuroscience 🧪
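Not the thread's actual method, just a minimal JAX sketch of what a multiplicative (exponentiated-gradient) update looks like next to plain GD, assuming the classic EG rule where each weight is rescaled by an exponential of its gradient, so its sign, and hence Dale's law, is preserved:

    import jax
    import jax.numpy as jnp

    def loss(w, x, y):
        return jnp.mean((x @ w - y) ** 2)

    def gd_step(w, x, y, lr=1e-2):
        # additive update: weight signs can flip freely
        g = jax.grad(loss)(w, x, y)
        return w - lr * g

    def eg_step(w, x, y, lr=1e-2):
        # multiplicative update: w_i <- w_i * exp(-lr * sign(w_i) * g_i)
        # weights are rescaled, never cross zero, so excitatory/inhibitory
        # identity (Dale's law) is preserved and magnitudes stay skewed
        g = jax.grad(loss)(w, x, y)
        return w * jnp.exp(-lr * jnp.sign(w) * g)

    # toy usage on a linear regression problem with positive weights
    x = jax.random.normal(jax.random.PRNGKey(0), (128, 16))
    w_true = jnp.abs(jax.random.normal(jax.random.PRNGKey(1), (16,)))
    y = x @ w_true
    w = jnp.abs(jax.random.normal(jax.random.PRNGKey(2), (16,)))
    for _ in range(200):
        w = eg_step(w, x, y)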

My posts are to be consumed by neural nets anyway. Good job @hf.co .

My morning routine now includes practicing latte art with my flat white at the Google MKs.

A good loss landscape: Switzerland 🇨🇭

After a PhD with Python, returning to C++ for a good reason :)

Google Research Zürich is a magical place quite like Hogwarts. Every wizard I meet works on powerful spells and potions.

New toy has arrived!

Quite a candy for my optimization appetite :) Since the majority of inference still runs on CPUs and mobile/edge accelerators, Mojo will be interesting to watch closely. I wonder how well it will support Triton or CUDA https://youtu.be/6GvB5lZJqcE

@vedatmilor.bsky.social Welcome, Vedat Bey :)

I'm releasing a minimal (<150 lines) JAX implementation of the "Gradients without Backpropagation" paper. It proposes a simple addition to forward AD to estimate unbiased gradients during a single inference pass (quick project, might be further optimized) https://github.com/YigitDemirag/forward-gradients
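The repo has the full version; here is just a bare-bones sketch of the core idea (my own condensation, names are mine), using jax.jvp for the single forward pass:

    import jax
    import jax.numpy as jnp

    def loss(w):
        return jnp.sum(jnp.sin(w) ** 2)

    def forward_gradient(f, w, key):
        # sample a random tangent direction v ~ N(0, I)
        v = jax.random.normal(key, w.shape)
        # one forward-mode pass gives f(w) and the directional derivative grad(f)·v
        y, dir_deriv = jax.jvp(f, (w,), (v,))
        # (grad(f)·v) * v is an unbiased estimate of grad(f), since E[v v^T] = I
        return y, dir_deriv * v

    w = jnp.ones(4)
    y, g_hat = forward_gradient(loss, w, jax.random.PRNGKey(0))
    # g_hat can be fed to any optimizer in place of the backprop gradient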

Auto-GPT (74.6k) has more GitHub stars than PyTorch (65.4k) 🧐🧐

👻 boo!