Long-audio understanding: Audio Flamingo 2 (AF2) combines a custom CLAP model, synthetic data, and multi-stage curriculum learning to achieve state-of-the-art performance on more than 20 benchmarks, including a new long-audio dataset (LongAudio).
