Found 2 big issues with Gemini's structured outputs (SO):

1. Using constrained decoding seems to lower performance on reasoning tasks.
2. The Generative AI SDK can break your model's reasoning.

Just re-ran the "Let Me Speak Freely" benchmarks with Gemini and got some interesting results.
