geronimo-ai.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

geronimo-ai.bsky.social

i like neural networks
blog: https://medium.com/@geronimo7
pet project: https://snapfiddle.ai - Image editor with inpainting

541 posts 885 followers 2,312 following

Good friend of mine started a podcast. If you're into sports, triathlon, overcoming serious setbacks, following your dreams - this is for you www.youtube.com/watch?v=rj0k...

submitted 26 days ago • 0 comments

Creator of Stable Diffusion casually talking about his latest adventure: FLUX www.youtube.com/watch?v=nrKK...

submitted 47 days ago • 0 comments

No training of a text-to-image model without text. Here's my latest blog post on how to caption large datasets with SmolVLM2, Moondream2, and Qwen 2.5 VL medium.com/@geronimo7/i...

submitted 49 days ago • 1 comment

i'm working on a mini diffusion model. from scratch. trained on imagenet. it seems to struggle with the anatomy of airplanes. dogs are easy. why

submitted 51 days ago • 0 comments

Deepseek released Flash MLA code github.com/deepseek-ai/...

submitted 66 days ago • 0 comments

github.com/openai/SWELa...

submitted 72 days ago • 0 comments

Revealing Hidden Generative Capabilities in Discriminative Models github.com/stanislavfor... arxiv.org/pdf/2502.07753

submitted 73 days ago • 0 comments

i've built an object remover app that runs entirely in the browser! All of the data stays on your computer. demo: next-lama.vercel.app

submitted 80 days ago • 1 comment

Deep Dive into LLMs like ChatGPT by @karpathy.bsky.social www.youtube.com/watch?v=7xTG...

submitted 83 days ago • 0 comments

huggingface.co/mistralai/Mi...

submitted 91 days ago • 0 comments

Love HF 🤗 Started an effort to reproduce R1 github.com/huggingface/...

submitted 96 days ago • 0 comments

4-bit Sana released demo: svdquant.mit.edu github.com/NVlabs/Sana/...

submitted 97 days ago • 0 comments

HAN lab released v1.1 of Sana's DC-AE huggingface.co/mit-han-lab/...

submitted 97 days ago • 0 comments

FluxEdit github.com/sayakpaul/fl...

submitted 100 days ago • 0 comments

holy sh, deepseek just delivered github.com/deepseek-ai/...

submitted 100 days ago • 0 comments

Flux-dev ControlNet Model huggingface.co/sayakpaul/ed...

submitted 103 days ago • 1 comment

NVIDIA AceInstruct-72B: a family of advanced SFT models for coding, mathematics, and general-purpose tasks research.nvidia.com/labs/adlr/ac... huggingface.co/nvidia/AceIn...

submitted 103 days ago • 0 comments

Musk Zuckerberg

submitted 104 days ago • 0 comments

FLUX Pro Finetuning API announced blackforestlabs.ai/announcing-t... $2-$6 per finetuned model

submitted 104 days ago • 0 comments

#psychology #perception #meme

submitted 106 days ago • 0 comments

transformers.js support for Kokoro! Kokoro is the #1 text-to-speech model on the leaderboard: huggingface.co/spaces/Pendr... thanks to @xenova.bsky.social and transformers.js it now runs in the browser!! huggingface.co/onnx-communi...

submitted 106 days ago • 0 comments

Tensor Product Attention Is All You Need "a novel attention mechanism that uses tensor decompositions to represent queries, keys, and values compactly, significantly shrinking KV cache size at inference time" code: github.com/tensorgi/T6 arxiv.org/abs/2501.06425

submitted 107 days ago • 0 comments

"Titans" implementation WIP github.com/lucidrains/t...

submitted 107 days ago • 0 comments

Sana 4096x4096px + 1024x1025px center crop

submitted 108 days ago • 2 comments

Sana 4k generates pretty impressive images, same old struggle with human anatomy though.

submitted 110 days ago • 1 comment

Generate 16 megapixels of weird fruits with Sana 4k! Runs on 24GB VRAM, cuda and mps thanks to a recent PR in diffusers, takes ~40s/img on an RTX 4090. Code 👇 github.com/geronimi73/3...

submitted 110 days ago • 0 comments

Sana 4k released huggingface.co/Efficient-La...

submitted 112 days ago • 0 comments

LLM aha moment 🤯 after the last decoder block, the last token contains ALL the context and only this information is used to generate the next token www.reddit.com/r/learnmachi...

submitted 112 days ago • 0 comments
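
A minimal sketch of the "aha moment" in the post above, assuming a toy single-head causal attention layer with identity projections. The names `EMBED`, `UNEMBED`, `causal_attention` and `predict` are hypothetical and do not come from any real model. It illustrates two things: the logits for the next token are computed from the last position's hidden state only, and that hidden state still depends on every earlier token because attention mixed them in.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(embeddings):
    """Toy causal self-attention: position i attends to positions 0..i."""
    dim = len(embeddings[0])
    outputs = []
    for i, query in enumerate(embeddings):
        visible = embeddings[: i + 1]  # causal mask: no access to later tokens
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * vec[d] for w, vec in zip(weights, visible))
            for d in range(dim)
        ]
        outputs.append(mixed)
    return outputs


def next_token_logits(hidden_states, unembed):
    """Project ONLY the last position's hidden state onto the vocabulary."""
    last = hidden_states[-1]
    return [sum(h * w for h, w in zip(last, row)) for row in unembed]


# Hypothetical 3-token vocabulary with 2-dimensional embeddings.
EMBED = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
UNEMBED = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]


def predict(tokens):
    hidden = causal_attention([EMBED[t] for t in tokens])
    return next_token_logits(hidden, UNEMBED)


# Same last token, different first token: the prediction still changes,
# because the last hidden state has absorbed the earlier context.
logits_a = predict([0, 1, 2])
logits_b = predict([1, 1, 2])
print(logits_a != logits_b)
```

In a real decoder there are many layers, learned projections and a final norm, but the readout step has the same shape: one vector at the last position goes into the output projection.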

Deep Learning Course by Han lab (MIT) lecture videos + slides hanlab.mit.edu/courses/2024...

submitted 113 days ago • 0 comments

Key-value memory is an important concept in modern machine learning (e.g., transformers). Ila Fiete, Kazuki Irie, and I have written a paper showing how key-value memory provides a way of thinking about memory organization in the brain: arxiv.org/abs/2501.02950

submitted 114 days ago • 3 comments
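
For readers unfamiliar with the term in the post above, here is a minimal sketch of a key-value memory read as it is commonly described for attention: a query is compared with stored keys, and the result is a similarity-weighted average of the stored values. The function name `kv_read` and the `sharpness` parameter are assumptions made for illustration and are not taken from the linked paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kv_read(query, keys, values, sharpness=1.0):
    """Soft lookup: weight each stored value by query-key similarity."""
    scores = [
        sharpness * sum(q * k for q, k in zip(query, key)) for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Two stored memories: keys are addresses, values are contents.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0], [20.0]]

# A query that matches the first key retrieves (almost exactly) its value.
retrieved = kv_read([1.0, 0.0], keys, values, sharpness=10.0)
print(retrieved)
```

Keeping keys separate from values means the representation used to find a memory can differ from the content that is returned, which is the organizational idea the post refers to.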

X is finished

submitted 115 days ago • 1 comment

Wrote Part 2 of my now ankle-deep dive into NVIDIA's Sana, this time looking at its Diffusion Transformer component: medium.com/@geronimo7/u...

submitted 115 days ago • 0 comments

PdfItDown: everything -> pdf built on top of markitdown

submitted 117 days ago • 0 comments