Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Claude Mythos Preview | 07 Apr 2026 | 77.80% | - | Yes | Source |
| Claude Opus 4.7 | 16 Apr 2026 | 64.30% | - | Yes | Source |
| GLM 5.1 | - | 58.40% | - | Yes | Source |
| GPT 5 Pro | 07 Aug 2025 | 57.70% | inferred family alias from gpt-5.4 (score=0.4083; benches=19) | Yes | Source |
| GPT 5.4 | 05 Mar 2026 | 57.70% | Public | Yes | Source |
| GPT 5 Search API | 14 Oct 2025 | 57.70% | inferred family alias from gpt-5.4 (score=0.3050; benches=19) | Yes | Source |
| GPT 5.3 Codex | 05 Feb 2026 | 56.80% | Public; xhigh reasoning | Yes | Source |
| Qwen 3.6 Plus | 01 Apr 2026 | 56.60% | - | Yes | Source |
| GPT 5.2 Codex | 18 Dec 2025 | 56.40% | - | Yes | Source |
| MiniMax M2.7 | 18 Mar 2026 | 56.20% | - | Yes | Source |
| GPT 5.2 | 11 Dec 2025 | 55.60% | - | Yes | Source |
| MiniMax M2.5 | 12 Feb 2026 | 55.40% | - | Yes | Source |
| GPT 5.4 Mini | 17 Mar 2026 | 54.40% | Public | Yes | Source |
| Gemini 3.1 Pro Preview | 19 Feb 2026 | 54.20% | Single Attempt | Yes | - |
| GPT 5.4 Nano | 17 Mar 2026 | 52.40% | Public | Yes | Source |
| Muse Spark | 08 Apr 2026 | 52.40% | - | Yes | Source |
| Claude Opus 4.5 | 24 Nov 2025 | 52% | Avg@5 | Yes | Source |
| Kimi K2.5 | 27 Jan 2026 | 50.70% | - | Yes | Source |
| Seed 2.0 Pro | 14 Feb 2026 | 46.90% | Seed2 official benchmark table \| SWE-Bench Pro | Yes | Source |
| Seed 2.0 Lite | 14 Feb 2026 | 46% | Seed2 official benchmark table \| SWE-Bench Pro | Yes | Source |