dlsq.bsky.social
AI Scientist at Mistral AI. Past: Google DeepMind. 🇧🇷 in 🇬🇧
19 posts 2,201 followers 274 following

We've upgraded Le Chat and it's blazing fast right now! Also available for Android and iOS as of today mistral.ai/en/news/all-...

We're releasing Mistral Small 3!
- 24B params, 81% MMLU
- Latency optimized: 150 tokens/s
- Competitive with Llama-3.3 70B, Qwen-2.5 32B, GPT4o-mini
- Apache 2.0
mistral.ai/news/mistral...

What people are going to do with AGI

agent swarm framework aces spatial reasoning test

Inventors of flow matching have released a comprehensive guide going over the math & code of flow matching! Also covers variants like non-Euclidean & discrete flow matching. A PyTorch library is also released with this guide! This looks like a very good read! 🔥 arxiv: arxiv.org/abs/2412.06264

Jane Street, a quant trading firm, has a very good YouTube channel. For comparison, DeepSeek is also a quant trading firm. They recently published a video on "Building Machine Learning Systems for a Trillion Trillion Floating Point Operations". Link: www.youtube.com/watch?v=139U...

AI Scientists: here is a technology that will automate your grunt work so you can spend more time with your kids
AI Ads: here is a technology that will automate spending time with your kids

A dataset of 1 million or 2 million Bluesky posts is completely irrelevant to training large language models. The primary use case for the datasets that people are losing their shit over isn't ChatGPT, it's social science research and developing systems that improve Bluesky.

Arxiv sharing reminder pdf ❌ abs ✅

READ: “3,337 Parisians were equipped with GPS trackers to record their journeys…for journeys from the outskirts of Paris to the center, the number of cyclists now far exceeds the number of motorists, a huge change from just 5 years ago.” Evidence of leadership. www.forbes.com/sites/carlto...

We have 2 new big updates today at Mistral:
- New Le Chat: With canvas, web search, image understanding and generation & more - and free!
- Pixtral Large, our Frontier 124B open weight multimodal model that powers it.
Try it: chat.mistral.ai
Blog post: mistral.ai/news/mistral...

There seems to be some renewed interest in making this work in the ML/AI space, so I'm here as well 👋 Here's my latest blog post for good measure, about how diffusion models of images perform autoregression in frequency space: sander.ai/2024/09/02/s... When I write more, I'll share here as well!

Quick thread in response to a question on token packing practices when pretraining LLMs!

Hey all, thanks for the follow! Just FYI, we're hiring AI Scientists and Engineers at Mistral AI. If you're driven and interested in building cutting-edge GenAI, we'd love to have you join our team. 🌐 Check out our openings: jobs.lever.co/mistral #AIJobs #TechCareers #MistralAI

Tencent's Hunyuan-Large The largest open-source Transformer-based MoE model with 389 billion parameters, can handle up to 256K tokens. Key features include large-scale synthetic data and a mixed expert routing strategy. Model: huggingface.co/tencent/Tenc... Paper: arxiv.org/abs/2411.02265

Thinking about how much Google struggled to get TPUs to work well makes me skeptical that Nvidia is in any danger of losing dominance, at least for now.

I try to stay up to date with Gen AI videos, but you can do camera control now? Seriously? Runway introduces advanced camera control for Gen-3 Alpha Turbo. Choose both the direction and intensity of how you move through your scenes for even more intention in every shot.

#Tokenization is undeniably a key player in the success story of #LLMs but we poorly understand why. I want to highlight progress we made in understanding the role of tokenization, developing the core incidents and mitigating its problems. 🧵👇

New starter pack! go.bsky.app/GZ4hZzu

Instead of blanket-following everyone from a starter pack I'm looking at the Posts tab to check for activity/quality, let's see how it goes.

Great way of bootstrapping your Tech/Sci follows