stefanlattner.bsky.social
Research Leader @ Sony CSL Paris
13 posts 55 followers 41 following
comment in response to post
We also show that our IC estimates can help predict EEG measurements. 💆‍♀️ Surprisal can be used for segment boundary detection and to simulate the information processing of a listener. 🎶 🧠
📜 Link to the paper: arxiv.org/pdf/2501.07474
Model weights are soon to come! 🏋️
💫✨ #SonyCSLMusic 💫✨
comment in response to post
3/ Results show:
- Higher fidelity (FAD ↓ by 20%)
- Better adherence to text & audio prompts (APA ↑)
- Faster generation with 5-step inference!
AI-assisted music production. 🎼💡 Let us know your thoughts! Congrats to the authors Javier Nistal and Marco Pasini! #AI #MusicGeneration #Transformers
comment in response to post
2/ 🎤 What’s new?
- Stereo output with superior fidelity
- Bridging the gap in Text-to-audio CLAP embeddings 📝🎵
- Faster inference using a consistency framework ⚡
Audio examples: sonycslparis.github.io/improved_dar/ 🎶👂
comment in response to post
1/ Building on Diff-A-Riff, we’ve upgraded to a stereo-capable autoencoder & replaced the U-Net with a Diffusion Transformer (DiT) to improve quality, diversity, and control. 🎧📈 Plus, our model generates high-quality audio with fewer denoising steps. 🚀
comment in response to post
Hybrid Losses for Hierarchical Embedding Learning
H. Tian, S. Lattner, B. McFee, C. Saitis
Congrats to the authors!
comment in response to post
Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding
M. Pasini, S. Lattner, G. Fazekas
Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures
A. Riou, A. Gagneré, G. Hadjeres, S. Lattner, G. Peeters