kolesnikov.ch
Member of technical staff at OpenAI (Zurich).
20 posts 713 followers 71 following

Looking for a small- or medium-sized VLM? PaliGemma 2 spans more than 150x of compute! Not sure yet whether you want to invest the time in 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints: 🤗 huggingface.co/blog/paligem... 🎤 developers.googleblog.com/en/introduci...

With some delay, JetFormer's *prequel* paper is finally out on arXiv: a radically simple ViT-based normalizing flow (NF) model that achieves SOTA results in its class. Jet is one of the key components of JetFormer, deserving a standalone report. Let's unpack: 🧵⬇️

PaliGemma 2 is out! Bigger models, better results. For the best experience, do not forget to finetune. Congrats, PaliGemma 2 team!

OK, it is yesterday's news already, but a good night's sleep is important. After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish the OpenAI Zurich office. Proud of our past work and looking forward to the future.

In arxiv.org/abs/2303.00848, @dpkingma.bsky.social and @ruiqigao.bsky.social suggested that noise augmentation could be used to make other likelihood-based models optimise perceptually weighted losses, like diffusion models do. So cool to see this working well in practice!

The answer has just dropped: bsky.app/profile/kole...

I always dreamed of a model that simultaneously 1. optimizes NLL of raw pixel data, 2. generates competitive high-res natural images, 3. is practical. But it seemed too good to be true. Until today! Our new JetFormer model (arxiv.org/abs/2411.19722) ticks all of these boxes. 🧵