Switti -- a new scale-wise transformer for text-to-image generation 🦾

🔥 Improved generation of fine-grained details. Switti outperforms existing autoregressive (AR) text-to-image (T2I) models and competes with state-of-the-art T2I diffusion models while being up to 7x faster.
