OpenAI News

Why we no longer evaluate SWE-bench Verified

Quick Summary

"SWE-bench Verified is increasingly contaminated and mismeasures frontier coding progress. Our analysis shows flawed tests and training leakage. We recommend SWE-bench Pro."

This article was originally published by OpenAI News.
