Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Gemini 3.0 Pro Preview | 18 Nov 2025 | 1.00 | With Code Execution | Yes | Source | |
| Grok 4 Heavy | 10 Jul 2025 | 1.00 | - | Yes | Source | |
| Claude Opus 4.5 | 24 Nov 2025 | 1.00 | Avg@5, 64k Thinking, With Tools | Yes | Source | |
| GPT 5 | 07 Aug 2025 | 1.00 | Thinking, With Python, Pass @ 1 | Yes | Source | |
| Grok 4 | 10 Jul 2025 | 0.99 | - | Yes | Source | |
| GPT OSS 20b | 05 Aug 2025 | 0.99 | High Reasoning Effort, With Tools | Yes | Source | |
| o3 | 16 Apr 2025 | 0.98 | - | Yes | Source | |
| GPT OSS 120b | 05 Aug 2025 | 0.98 | High Reasoning Effort, With Tools | Yes | Source | |
| Grok 3 Beta | 19 Feb 2025 | 0.93 | Think, Cons@64 | Yes | Source | |
| o4 Mini | 16 Apr 2025 | 0.93 | - | Yes | Source | |
| Qwen3 235B A22B Thinking 2507 | 25 Jul 2025 | 0.92 | - | Yes | Source | |
| GPT 5 mini | 07 Aug 2025 | 0.91 | High Reasoning Effort, No Tools | Yes | Source | |
| Grok 3 Mini Beta | 19 Feb 2025 | 0.91 | Think, Cons@64 | Yes | Source | |
| Gemini 2.5 Pro Preview (2025-06-05) | 05 Jun 2025 | 0.88 | Single Attempt | Yes | Source | |
| Deepseek R1 (2025-05-28) | 28 May 2025 | 0.85 | - | Yes | Source | |
| EXAONE 4.0 32B | 15 Jul 2025 | 0.85 | Reasoning | Yes | Source | |
| GPT 5 nano | 07 Aug 2025 | 0.85 | High Reasoning Effort, No Tools | Yes | Source | |
| Grok 3 Mini | 18 Apr 2025 | 0.83 | High Reasoning Effort | Yes | Source | |
| Gemini 2.5 Pro Preview (2025-05-06) | 06 May 2025 | 0.83 | Pass@1 | Yes | Source | |
| Qwen3 235B A22B | 29 Apr 2025 | 0.81 | - | Yes | Source | |
| Gemini 2.5 Flash Preview (2025-04-17) | 17 Apr 2025 | 0.78 | Thinking, Single Attempt | Yes | Source | |
| Claude Opus 4.1 | 05 Aug 2025 | 0.78 | - | Yes | Source | |
| Phi 4 Reasoning Plus | 30 Apr 2025 | 0.78 | - | Yes | Source | |
| Qwen3 32B | 29 Apr 2025 | 0.73 | - | Yes | Source | |
| Llama 3.1 Nemotron Ultra 253B v1 | 07 Apr 2025 | 0.72 | - | Yes | Source | |
| Gemini 2.5 Flash Preview (2025-05-20) | 20 May 2025 | 0.72 | Pass@1 | Yes | Source | |
| Qwen3 30B A3B | 29 Apr 2025 | 0.71 | - | Yes | Source | |
| Qwen3 235B A22B Instruct 2507 | 21 Jul 2025 | 0.70 | - | Yes | Source | |
| Magistral Medium | 10 Jun 2025 | 0.65 | - | Yes | Source | |
| Gemini 2.5 Flash Lite Preview | 17 Jun 2025 | 0.63 | Thinking | Yes | Source | |
| Phi 4 Reasoning | 30 Apr 2025 | 0.63 | - | Yes | Source | |
| Magistral Small | 10 Jun 2025 | 0.63 | Pass@1 | Yes | Source | |
| Llama 3.3 Nemotron Super 49B v1 | 18 Mar 2025 | 0.58 | - | Yes | Source | |
| Grok 3 | 18 Apr 2025 | 0.57 | - | Yes | Source | |
| Kimi K2 Instruct | 11 Jul 2025 | 0.49 | Avg@64 | Yes | Source | |
| Llama 3.1 Nemotron Nano 8B V1 | 18 Mar 2025 | 0.47 | - | Yes | Source | |
| EXAONE 4.0 1.2B | 15 Jul 2025 | 0.45 | Reasoning | Yes | Source | |
| Gemini 2.0 Flash | 05 Feb 2025 | 0.28 | Single Attempt | Yes | Source | |
| Gemini Diffusion | 20 May 2025 | 0.23 | Pass@1 | Yes | Source |