First Proof: 11 mathematicians build AI math test with answers that never existed online
Mathematicians from Stanford, Harvard, Columbia, Yale, and EPFL created First Proof, a benchmark of 10 research-level problems drawn from their own unpublished work spanning algebraic combinatorics, spectral graph theory, symplectic geometry, and more. Answers were encrypted and posted to 1stproof.org.
GPT-5.2 Pro and Gemini 3.0 DeepThink struggled on most problems under a strict one-shot protocol. Because the solutions have never appeared online, the models cannot have seen them in training data, which rules out contamination from published answers.
View full digest for February 17, 2026