timkellogg.me • 36 days ago

OpenAI had the answers to FrontierMath, which calls their o3 results into question

A lot of people think OpenAI didn’t actually train on the test set, although they admit that there’s still plenty of contamination potential