Check out our new work on video-guided audio gen with a focus on fine-grained creative control! Done by @czyang.bsky.social during an internship with our group at Adobe Research. Super fun model!
Reposted from Ziyang Chen
🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can:
⌨️ Make a typewriter sound like a piano 🎹
🐱 Make a cat meow like a lion roars! 🦁
⏱️ Perfectly time existing SFX 💥 to a video.

arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/
