Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| GPT 5 | 07 Aug 2025 | 0.88 | With Thinking, Pass @ 1, Diff Method | Yes | Source |
| o3 Pro | 10 Jun 2025 | 0.85 | High Reasoning Effort | No | Source |
| Gemini 2.5 Pro Preview (2025-06-05) | 05 Jun 2025 | 0.82 | Diff-Fenced | No | Source |
| o3 | 16 Apr 2025 | 0.81 | High Reasoning Effort | No | Source |
| Grok 4 | 10 Jul 2025 | 0.80 | Diff | No | Source |
| Gemini 2.5 Pro Preview (2025-05-06) | 06 May 2025 | 0.77 | Whole | Yes | Source |
| Claude Opus 4 | 21 May 2025 | 0.72 | 32k Thinking | No | Source |
| o4 Mini | 16 Apr 2025 | 0.72 | High Reasoning Effort | No | Source |
| GPT 5 mini | 07 Aug 2025 | 0.72 | High Reasoning Effort, Diff Method | Yes | Source |
| Claude 3.7 Sonnet | 24 Feb 2025 | 0.65 | 32k Thinking | No | Source |
| Gemini 2.5 Flash Preview (2025-05-20) | 20 May 2025 | 0.62 | Whole | Yes | Source |
| Qwen3 Coder 480B A35B Instruct | 22 Jul 2025 | 0.62 | - | Yes | Source |
| o1 | 17 Dec 2024 | 0.62 | - | No | Source |
| Claude Sonnet 4 | 21 May 2025 | 0.61 | 32k Thinking | No | Source |
| o3 mini | 30 Jan 2025 | 0.60 | High Reasoning Effort | No | Source |
| Kimi K2 Instruct | 11 Jul 2025 | 0.60 | Acc | Yes | Source |
| Qwen3 235B A22B Instruct 2507 | 21 Jul 2025 | 0.57 | - | Yes | Source |
| DeepSeek R1 (2025-01-20) | 20 Jan 2025 | 0.57 | - | No | Source |
| DeepSeek V3 (2025-03-24) | 25 Mar 2025 | 0.55 | - | No | Source |
| Grok 3 Beta | 19 Feb 2025 | 0.53 | - | No | Source |
| GPT 4.1 | 14 Apr 2025 | 0.52 | - | No | Source |
| Claude 3.5 Sonnet (2024-10-22) | 22 Oct 2024 | 0.52 | - | No | Source |
| Gemini 2.5 Flash Preview (2025-04-17) | 17 Apr 2025 | 0.51 | Thinking, Whole | Yes | Source |
| GPT 5 nano | 07 Aug 2025 | 0.48 | High Reasoning Effort, Diff Method | Yes | Source |
| Magistral Medium | 10 Jun 2025 | 0.47 | - | Yes | Source |
| GPT 4.5 | 27 Feb 2025 | 0.45 | - | No | Source |
| GPT OSS 120b | 05 Aug 2025 | 0.44 | High Reasoning Effort | Yes | Source |
| GPT OSS 20b | 05 Aug 2025 | 0.34 | High Reasoning Effort | Yes | Source |
| o1 mini | 12 Sep 2024 | 0.33 | - | No | Source |
| GPT 4.1 Mini | 14 Apr 2025 | 0.32 | - | No | Source |
| Claude 3.5 Haiku | 04 Nov 2024 | 0.28 | - | No | Source |
| Gemini 2.5 Flash Lite Preview | 17 Jun 2025 | 0.27 | Thinking | Yes | Source |
| Gemini 2.0 Flash | 05 Feb 2025 | 0.22 | Whole | Yes | Source |
| GPT 4.1 Nano | 14 Apr 2025 | 0.09 | - | No | Source |