🚀 Exciting news! Our paper "Learning to Discretize Diffusion ODEs" has been accepted as an Oral at #ICLR2025! 🎉
[1/n]
We propose LD3, a lightweight framework that learns the optimal time discretization for sampling from pre-trained Diffusion Probabilistic Models (DPMs).
[2/n]
Diffusion models produce high-quality samples but are computationally expensive due to multi-step sampling. Existing acceleration methods either require costly retraining (distillation) or depend on manually designed time discretization heuristics. LD3 changes that.
LD3 optimizes the time discretization for diffusion ODE solvers by minimizing the global truncation error, resulting in higher sample quality with fewer sampling steps.
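In symbols (my notation, not necessarily the paper's): let Ψ(x_T; ξ) be a given ODE solver run from noise x_T with discretization ξ, and let ξ_fine be a very fine "teacher" discretization. LD3 then learns the few time steps by solving, for some distance d (e.g., squared error in pixel or latent space):

```latex
\min_{\xi}\;
\mathbb{E}_{x_T \sim \mathcal{N}(0,\,\sigma_T^2 I)}
\Big[\, d\big(\Psi(x_T;\,\xi),\ \Psi(x_T;\,\xi_{\text{fine}})\big) \Big],
\qquad
\xi = (t_N > t_{N-1} > \dots > t_0)
```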
LD3 uses a teacher-student framework:
🔹Teacher: Runs the ODE solver with small step sizes.
🔹Student: Learns optimal discretization to match the teacher's output.
🔹LD3 backpropagates through the ODE solver to refine the time steps (minimal sketch below).
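Here is a minimal PyTorch sketch of that loop (my simplification, not the official LD3 code). The dummy denoiser, the plain Euler solver, the EDM-style time range [0.002, 80], and the softmax parameterization of the learnable steps are all placeholder assumptions; the paper uses stronger solvers and its own parameterization and loss.

```python
import torch


def denoiser(x, t):
    """Placeholder for a pretrained noise/score network (assumption; plug in a real DPM here)."""
    return -x / (t ** 2 + 1.0)


def euler_ode_solve(x, timesteps):
    """Integrate the probability-flow ODE with plain Euler steps; differentiable in `timesteps`."""
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        x = x + (t_next - t_cur) * denoiser(x, t_cur)
    return x


def student_timesteps(raw, t_min=0.002, t_max=80.0):
    """Map unconstrained parameters to a strictly decreasing sequence of times in [t_min, t_max]."""
    gaps = torch.softmax(raw, dim=0)                          # positive step fractions summing to 1
    cum = torch.cat([torch.zeros(1), torch.cumsum(gaps, dim=0)])
    return t_max + cum * (t_min - t_max)                      # runs from t_max down to t_min


raw = torch.nn.Parameter(torch.zeros(8))                      # 8 learnable gaps -> 9 time steps
opt = torch.optim.Adam([raw], lr=1e-2)

for _ in range(200):
    x_T = 80.0 * torch.randn(4, 3, 32, 32)                    # shared initial noise for both solves
    with torch.no_grad():                                      # teacher: many small steps
        x_teacher = euler_ode_solve(x_T, torch.linspace(80.0, 0.002, 200))
    x_student = euler_ode_solve(x_T, student_timesteps(raw))   # student: few learned steps
    loss = (x_student - x_teacher).pow(2).mean()               # match the teacher's output
    opt.zero_grad()
    loss.backward()                                            # backprop through the ODE solver
    opt.step()
```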
A potential problem with the student is its very limited capacity: it only learns a handful of time steps. To address this, we propose a soft surrogate loss, simplifying the student's optimization task.
This surrogate loss is theoretically close to the original distillation objective, leading to better convergence and avoiding underfitting.
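Continuing the sketch above, one simple way to write such a relaxation (my guess at the form, not necessarily the paper's exact surrogate) is a hinge on the distance to the teacher output, so the student is only penalized once it leaves a small ball of radius r around the teacher:

```python
import torch


def soft_surrogate_loss(x_student, x_teacher, radius=0.1):
    """Hinge-style relaxation: zero loss inside a ball of radius `radius` around the teacher."""
    dist = (x_student - x_teacher).flatten(1).norm(dim=1)   # per-sample L2 distance
    return torch.clamp(dist - radius, min=0.0).mean()
```

Swapping this in for the plain MSE in the loop above gives the student a feasible target despite its limited capacity.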
LD3 significantly improves sample quality.