Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Claude Opus 4.7 | 16 Apr 2026 | 14400 | xHigh thinking; path length avg; difficulty-weighted 5122.5986; max 19112; avg validity 99.70%; 50 starting words | No | Source |
| GPT 5.5 | 23 Apr 2026 | 9845 | Medium thinking; path length avg; difficulty-weighted 3315.5211; max 15635; avg validity 99.45%; 50 starting words | No | Source |
| Gemini 3.1 Pro Preview | 19 Feb 2026 | 6445 | High thinking; path length avg; difficulty-weighted 1929.1071; max 11728; avg validity 95.98%; 50 starting words | No | Source |
| o3 | 16 Apr 2025 | 1036 | - | No | Source |
| o1 | 17 Dec 2024 | 338 | - | No | Source |
| o4 Mini | 16 Apr 2025 | 337 | High Reasoning Effort | No | Source |
| o3 mini | 30 Jan 2025 | 324 | - | No | Source |
| Grok 3 Beta | 19 Feb 2025 | 231 | - | No | Source |
| GPT 4.5 | 27 Feb 2025 | 220 | - | No | Source |
| Claude 3.5 Sonnet (2024-10-22) | 22 Oct 2024 | 167 | - | No | Source |
| Claude 3.7 Sonnet | 24 Feb 2025 | 166 | - | No | Source |
| Claude Sonnet 4 | 21 May 2025 | 99 | - | No | Source |
| Claude 3 Opus | 04 Mar 2024 | 48 | - | No | Source |