stefanlattner.bsky.social
Research Leader @ Sony CSL Paris
13 posts
55 followers
41 following
We also show that our information content (IC) estimates can help predict EEG measurements.
Surprisal can be used for segment boundary detection and to simulate the information processing of a listener.
Link to the paper: arxiv.org/pdf/2501.07474
Model weights are coming soon!
#SonyCSLMusic
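To make the surprisal idea above concrete, here is a minimal sketch, not the paper's method. It assumes a model has already assigned a probability to each observed event, computes IC as the negative log probability, and marks local IC peaks as boundary candidates. The probabilities, the `boundary_candidates` helper, and the threshold are hypothetical, chosen only for illustration.

```python
import math


def information_content(probs):
    """Return IC (surprisal) in bits for each observed-event probability."""
    return [-math.log2(p) for p in probs]


def boundary_candidates(ic, threshold):
    """Hypothetical heuristic: indices where IC is a local peak >= threshold."""
    return [
        i
        for i in range(1, len(ic) - 1)
        if ic[i] >= threshold and ic[i] > ic[i - 1] and ic[i] >= ic[i + 1]
    ]


# Toy probabilities a model might assign to the events it actually observed.
probs = [0.5, 0.5, 0.0625, 0.5, 0.25, 0.5]
ic = information_content(probs)
peaks = boundary_candidates(ic, threshold=3.0)
print(ic)
print(peaks)
```

Unlikely events get high IC, so a sudden IC peak is a plausible hint that a new segment starts there.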
3/ Results show:
- Higher fidelity (FAD ↓ by 20%)
- Better adherence to text & audio prompts (APA ↑)
- Faster generation with 5-step inference!
AI-assisted music production. Let us know your thoughts!
Congrats to the authors Javier Nistal and Marco Pasini!
#AI #MusicGeneration #Transformers
2/ What's new?
- Stereo output with superior fidelity
- Bridging the gap between text and audio CLAP embeddings
- Faster inference using a consistency framework
Audio examples: sonycslparis.github.io/improved_dar/
1/ Building on Diff-A-Riff, we've upgraded to a stereo-capable autoencoder & replaced the U-Net with a Diffusion Transformer (DiT) to improve quality, diversity, and control. Plus, our model generates high-quality audio with fewer denoising steps.
Hybrid Losses for Hierarchical Embedding Learning
H. Tian, S. Lattner, B. McFee, C. Saitis
Congrats to the authors!
Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding
M. Pasini, S. Lattner, G. Fazekas
Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures
A. Riou, A. Gagneré, G. Hadjeres, S. Lattner, G. Peeters