In the paper "Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models", the authors explore how DALL-E 3 handles logical structure in generated scenes. They conclude that image generators need explicit mechanisms to represent relationships.