Individual benchmark scores plotted by date.

| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| o3 Pro | 10 Jun 2025 | 2748 | - | Yes | Source | |
| DeepSeek V3.2 Speciale | 01 Dec 2025 | 2701 | - | Yes | Source | |
| GPT OSS 120b | 05 Aug 2025 | 2622 | High Reasoning Effort, With Tools | Yes | Source | |
| o3 | 16 Apr 2025 | 2517 | - | Yes | Source | |
| GPT OSS 20b | 05 Aug 2025 | 2516 | High Reasoning Effort, With Tools | Yes | Source | |
| Gemma 4 31B | 02 Apr 2026 | 2150 | ELO | Yes | Source | |
| Gemma 4 26B A4B | 02 Apr 2026 | 1718 | ELO | Yes | Source | |
| o1 pro | 19 Mar 2025 | 1707 | - | Yes | Source | |
| Qwen 3.5 122B A10B | 24 Feb 2026 | 0.85 | Raw score: 2100 | Yes | Source | |
| Qwen 3.5 35B A3B | 24 Feb 2026 | 0.82 | Raw score: 2028 | Yes | Source | |
| GPT OSS Safeguard 120b | 29 Oct 2025 | 0.82 | inferred high-confidence family alias from gpt-oss-120b (score=0.5102; benches=7) | Yes | Source | |
| Qwen 3.5 27B | 24 Feb 2026 | 0.81 | Raw score: 1899 | Yes | Source | |
| Qwen 3.5 Flash | 23 Feb 2026 | 0.81 | inferred family alias from qwen3.5-27b (score=0.4147; benches=81) | Yes | Source | |
| GPT OSS Safeguard 20b | 29 Oct 2025 | 0.74 | inferred high-confidence family alias from gpt-oss-20b (score=0.5137; benches=7) | Yes | Source | |
| DeepSeek V3.2 Exp | 29 Sep 2025 | 0.71 | Raw rating ≈ 2121; normalized by 3000 max | Yes | Source | |
| DeepSeek OCR 2 | - | 0.71 | inferred family alias from deepseek-v3.2-exp (score=0.3809; benches=14) | Yes | Source | |
| DeepSeek V3.1 Terminus | 22 Sep 2025 | 0.70 | inferred alias from deepseek-v3.1 | Yes | Source | |
| DeepSeek V3.1 | 21 Aug 2025 | 0.70 | Codeforces Div1 rating in thinking mode | Yes | Source | |
| Qwen 3 32B | - | 0.66 | - | Yes | Source |
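
The Top Score column mixes raw ratings (for example 2748) with normalized values between 0 and 1. The only normalization the table spells out is the DeepSeek V3.2 Exp note, "Raw rating ≈ 2121; normalized by 3000 max". A minimal sketch of that arithmetic follows; the function name `normalize_rating` and the parameter `max_rating` are hypothetical, chosen here for illustration, and the 3000 divisor is assumed to apply only to that row.

```python
def normalize_rating(raw_rating: float, max_rating: float = 3000.0) -> float:
    """Scale a raw rating to a 0-1 score by dividing by an assumed maximum.

    The default of 3000 comes from the DeepSeek V3.2 Exp note in the table;
    it is an assumption for any other row.
    """
    if max_rating <= 0:
        raise ValueError("max_rating must be positive")
    return raw_rating / max_rating


# DeepSeek V3.2 Exp: raw rating of about 2121, normalized by a 3000 maximum
print(round(normalize_rating(2121), 2))  # 0.71
```

The same divisor does not reproduce the other normalized rows: Qwen 3.5 122B A10B lists a raw score of 2100 and a normalized 0.85, while 2100 / 3000 is 0.70. Those rows presumably use a different scaling that the table does not state.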