My last-standing personal litmus test for new AI models that's easy to judge: write a novel joke that has to incorporate a few random elements, so it can't just repeat an existing one.
The reasoning models certainly make a better attempt at it but still come up way short. It doesn't seem like they're far off, though.