Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| o3 Preview | 20 Dec 2024 | 0.97 | - | Yes | Source |
| GPT OSS 120b | 05 Aug 2025 | 0.97 | High Reasoning Effort, With Tools | Yes | Source |
| GPT OSS 20b | 05 Aug 2025 | 0.96 | High Reasoning Effort, With Tools | Yes | Source |
| Grok 3 Mini Beta | 19 Feb 2025 | 0.96 | Think, Cons@64 | Yes | Source |
| Grok 3 Beta | 19 Feb 2025 | 0.96 | Reasoning | Yes | Source |
| o4 Mini | 16 Apr 2025 | 0.93 | - | Yes | Source |
| o3 Pro | 10 Jun 2025 | 0.93 | - | Yes | Source |
| o3 | 16 Apr 2025 | 0.92 | - | Yes | Source |
| Deepseek R1 (2025-05-28) | 28 May 2025 | 0.91 | - | Yes | Source |
| Grok 3 Mini | 18 Apr 2025 | 0.91 | High Reasoning Effort | Yes | Source |
| Gemini 2.5 Flash Preview (2025-04-17) | 17 Apr 2025 | 0.88 | Thinking, Single Attempt | Yes | Source |
| o3 mini | 30 Jan 2025 | 0.87 | - | Yes | Source |
| o1 pro | 19 Mar 2025 | 0.86 | - | Yes | Source |
| Qwen3 235B A22B | 29 Apr 2025 | 0.86 | - | Yes | Source |
| Qwen3 32B | 29 Apr 2025 | 0.81 | - | Yes | Source |
| Phi 4 Reasoning Plus | 30 Apr 2025 | 0.81 | - | Yes | Source |
| Granite 3.3 8B Instruct | 16 Apr 2025 | 0.81 | - | Yes | Source |
| Qwen3 30B A3B | 29 Apr 2025 | 0.80 | - | Yes | Source |
| Claude 3.7 Sonnet | 24 Feb 2025 | 0.80 | 64k Thinking | Yes | Source |
| Deepseek R1 (2025-01-20) | 20 Jan 2025 | 0.80 | - | No | Source |
| QwQ 32B | 05 Mar 2025 | 0.80 | - | Yes | Source |
| Kimi k1.5 | 20 Jan 2025 | 0.78 | - | No | Source |
| Phi 4 Reasoning | 30 Apr 2025 | 0.75 | - | Yes | Source |
| o1 | 17 Dec 2024 | 0.74 | - | Yes | Source |
| Magistral Medium | 10 Jun 2025 | 0.74 | - | Yes | Source |
| Magistral Small | 10 Jun 2025 | 0.71 | Pass@1 | Yes | Source |
| Kimi K2 Instruct | 11 Jul 2025 | 0.70 | Avg@64 | Yes | Source |
| Grok 3 | 18 Apr 2025 | 0.60 | - | Yes | Source |
| DeepSeek V3 (2025-03-24) | 25 Mar 2025 | 0.59 | - | Yes | Source |
| Phi 4 Mini Reasoning | 30 Apr 2025 | 0.57 | - | Yes | Source |
| QwQ 32B Preview | 28 Nov 2024 | 0.50 | - | Yes | Source |
| GPT 4.1 Mini | 14 Apr 2025 | 0.50 | - | Yes | Source |
| GPT 4.1 | 14 Apr 2025 | 0.48 | - | Yes | Source |
| o1 preview | 12 Sep 2024 | 0.42 | - | Yes | Source |
| DeepSeek V3 (2024-12-26) | 25 Dec 2024 | 0.39 | - | No | Source |
| GPT 4.5 | 27 Feb 2025 | 0.37 | - | Yes | Source |
| Gemini 2.0 Flash | 05 Feb 2025 | 0.32 | Single Attempt | Yes | Source |
| GPT 4.1 Nano | 14 Apr 2025 | 0.29 | - | Yes | Source |
| Claude 3.5 Sonnet (2024-10-22) | 22 Oct 2024 | 0.16 | - | Yes | - |
| GPT 4o | 06 Aug 2024 | 0.13 | - | Yes | Source |