OpenAI and Paradigm's EVMbench: AI agents exploit 72% of smart contract vulnerabilities
OpenAI and crypto firm Paradigm built EVMbench, a benchmark measuring how well AI agents find, fix, and exploit Ethereum smart contract vulnerabilities across 120 bugs from 40 real-world audits. GPT-5.3-Codex exploited 72% of vulnerabilities; Claude Opus 4.6 led detection at 45.6%.
The biggest challenge isn't exploitation but discovery: when agents received hints about vulnerability locations, exploit rates jumped from 63% to 96% and fix rates from 39% to 94%. With over $100B locked in smart contracts, the researchers see both security opportunity and growing risk.
View full digest for February 20, 2026