Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Gemma 3 27B | 12 Mar 2025 | 40.26 | - | No | Source |
| GPT 4o Mini (2024-07-18) | 18 Jul 2024 | 37.21 | - | No | Source |
| Claude 3.5 Haiku | 04 Nov 2024 | 36.74 | - | No | Source |
| Claude 3 Haiku | 13 Mar 2024 | 34.21 | - | No | Source |
| Nova Pro 1.0 | 04 Dec 2024 | 30.05 | - | No | Source |
| Phi 4 | 12 Dec 2024 | 29.43 | - | No | Source |
| Gemma 2 27B | 27 Jun 2024 | 27.24 | - | No | Source |
| DeepSeek V3 (2025-03-24) | 25 Mar 2025 | 26.15 | - | No | Source |
| Llama 3.3 70B Instruct | 06 Dec 2024 | 22.81 | - | No | Source |
| Claude 3 Opus | 04 Mar 2024 | 22.70 | - | No | Source |
| Mistral Large 2 | 24 Jul 2024 | 21.40 | - | No | Source |
| Kimi K2 Instruct | 11 Jul 2025 | 20.38 | - | No | Source |
| Grok 2 | 13 Aug 2024 | 20.14 | - | No | Source |
| Claude 3.5 Sonnet (2024-10-22) | 22 Oct 2024 | 19.94 | - | No | Source |
| Claude 3.7 Sonnet | 24 Feb 2025 | 19.76 | - | No | Source |
| Qwen2.5 72B Instruct | 19 Sept 2024 | 19.09 | - | No | Source |
| o1 mini | 12 Sept 2024 | 18.55 | High Reasoning Effort | No | Source |
| Claude Opus 4.1 | 05 Aug 2025 | 18.51 | No Reasoning | No | Source |
| o3 mini | 30 Jan 2025 | 18.43 | High Reasoning Effort | No | Source |
| GPT 4o | 06 Aug 2024 | 17.21 | - | No | Source |
| Claude Opus 4 | 21 May 2025 | 17.06 | No Reasoning | No | Source |
| Qwen3 235B A22B Thinking 2507 | 25 Jul 2025 | 16.77 | - | No | Source |
| o4 Mini | 16 Apr 2025 | 15.79 | High Reasoning Effort | No | Source |
| GPT OSS 120b | 05 Aug 2025 | 15.65 | Medium Reasoning Effort | No | Source |
| Qwen3 235B A22B | 29 Apr 2025 | 15.41 | - | No | Source |
| GPT 4o | 20 Nov 2024 | 15.34 | - | No | Source |
| Claude Sonnet 4 | 21 May 2025 | 14.85 | No Reasoning | No | Source |
| DeepSeek R1 (2025-05-28) | 28 May 2025 | 14.56 | - | No | Source |
| o3 | 16 Apr 2025 | 14.38 | High Reasoning Effort | No | Source |
| o3 Pro | 10 Jun 2025 | 14.22 | Medium Reasoning Effort | No | Source |
| Grok 3 Beta | 19 Feb 2025 | 14.19 | No Reasoning | No | Source |
| Grok 3 Mini Beta | 19 Feb 2025 | 14.04 | Low Reasoning Effort | No | Source |
| GPT 4.5 | 27 Feb 2025 | 13.64 | - | No | Source |
| GPT 5 mini | 07 Aug 2025 | 13.28 | - | No | Source |
| o1 preview | 12 Sept 2024 | 13.04 | - | No | Source |
| DeepSeek R1 (2025-01-20) | 20 Jan 2025 | 12.65 | - | No | Source |
| Grok 4 | 10 Jul 2025 | 12.41 | - | No | Source |
| Gemini 2.5 Pro Preview (2025-06-05) | 05 Jun 2025 | 12.38 | - | No | Source |
| Qwen3 30B A3B | 29 Apr 2025 | 12.28 | - | No | Source |
| o1 | 17 Dec 2024 | 11.74 | Medium Reasoning Effort | No | Source |
| GLM 4.5 | - | 11.30 | - | No | Source |
| Gemini 2.5 Pro Preview (2025-05-06) | 06 May 2025 | 10.62 | - | No | Source |
| GPT 5 | 07 Aug 2025 | 10.34 | Medium Reasoning Effort | No | Source |