Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Claude Opus 4.5 | 24 Nov 2025 | 0.81 | Avg@5, 64k Thinking | Yes | Source |
| Gemini 3.0 Pro Preview | 18 Nov 2025 | 0.76 | Single Attempt | Yes | Source |
| GPT 5 | 07 Aug 2025 | 0.75 | With Thinking, Pass@1 | Yes | Source |
| Claude Opus 4.1 | 05 Aug 2025 | 0.74 | - | Yes | Source |
| Kimi K2 Instruct | 11 Jul 2025 | 0.72 | Multiple Attempts (Acc) | Yes | Source |
| GPT 5 mini | 07 Aug 2025 | 0.71 | High Reasoning Effort, No Tools | Yes | Source |
| Qwen3 Coder 480B A35B Instruct | 22 Jul 2025 | 0.70 | OpenHands Scaffold, 500 Turns | Yes | Source |
| Gemini 2.5 Pro Preview (2025-06-05) | 05 Jun 2025 | 0.67 | Multiple Attempts | Yes | Source |
| Gemini 2.5 Pro Preview (2025-05-06) | 06 May 2025 | 0.63 | - | Yes | Source |
| GPT OSS 120b | 05 Aug 2025 | 0.62 | High Reasoning Effort | Yes | Source |
| Devstral Medium 1.1 | 10 Jul 2025 | 0.62 | - | Yes | Source |
| GPT OSS 20b | 05 Aug 2025 | 0.61 | High Reasoning Effort | Yes | Source |
| Gemini 2.5 Flash Preview (2025-05-20) | 20 May 2025 | 0.60 | - | Yes | Source |
| GPT 5 nano | 07 Aug 2025 | 0.55 | High Reasoning Effort, No Tools | Yes | Source |
| GPT 4.1 | 14 Apr 2025 | 0.55 | - | Yes | Source |
| Devstral Small 1.1 | 10 Jul 2025 | 0.54 | OpenHands Scaffold | Yes | Source |
| Gemini 2.5 Flash Lite Preview | 17 Jun 2025 | 0.45 | Thinking, Multiple Attempts | Yes | Source |
| GPT 4.5 | 27 Feb 2025 | 0.38 | - | Yes | Source |
| GPT 4o | 06 Aug 2024 | 0.33 | - | Yes | Source |
| Gemini Diffusion | 20 May 2025 | 0.23 | Pass@1 | Yes | Source |
| GPT 4o Mini (2024-07-18) | 18 Jul 2024 | 0.09 | - | Yes | Source |