Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Qwen3 235B A22B Thinking 2507 | 25 Jul 2025 | 0.78 | 2024-11-25 | Yes | Source |
| Kimi K2 Instruct | 11 Jul 2025 | 0.76 | Pass@1 | Yes | Source |
| Qwen3 235B A22B Instruct 2507 | 21 Jul 2025 | 0.75 | 2024-11-25 | Yes | Source |
| o3 | 16 Apr 2025 | 0.74 | High Reasoning Effort | No | Source |
| Claude Opus 4 | 21 May 2025 | 0.73 | 32k Thinking | No | Source |
| Claude Sonnet 4 | 21 May 2025 | 0.72 | 64k Thinking | No | Source |
| o4 Mini | 16 Apr 2025 | 0.72 | High Reasoning Effort | No | Source |
| DeepSeek R1 (2025-05-28) | 28 May 2025 | 0.69 | - | No | Source |
| Claude 3.7 Sonnet | 24 Feb 2025 | 0.67 | 64k Thinking | No | Source |
| DeepSeek R1 (2025-01-20) | 20 Jan 2025 | 0.65 | - | No | Source |
| Grok 3 Beta | 19 Feb 2025 | 0.62 | High Reasoning Effort | No | Source |
| GPT 4.5 | 27 Feb 2025 | 0.59 | - | No | Source |
| GPT 4.1 | 14 Apr 2025 | 0.56 | - | No | Source |
| Claude 3.5 Sonnet (2024-10-22) | 22 Oct 2024 | 0.52 | - | No | Source |
| GPT 4.1 Mini | 14 Apr 2025 | 0.52 | - | No | Source |
| GPT 4.1 Nano | 14 Apr 2025 | 0.40 | - | No | Source |
| Claude 3.5 Haiku | 04 Nov 2024 | 0.40 | - | No | Source |