Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Kimi K2 Thinking | 06 Nov 2025 | 48.70% | inferred alias from kimi-k2-thinking-0905 | Yes | Source |
| Qwen 3.5 27B | 24 Feb 2026 | 40.10% | - | Yes | Source |
| Qwen 3.5 Flash | 23 Feb 2026 | 40.10% | inferred family alias from qwen3.5-27b (score=0.4147; benches=81) | Yes | Source |
| Qwen 3.5 122B A10B | 24 Feb 2026 | 39.50% | - | Yes | Source |
| Qwen 3.5 35B A3B | 24 Feb 2026 | 36% | - | Yes | Source |
| Qwen 3 235B A22B Thinking 2507 | - | 32.50% | - | Yes | Source |
| Qwen 3 Next 80B A3B Thinking | - | 29.70% | - | Yes | Source |
| Kimi K2 (2025-09-05) | 05 Sep 2025 | 27.10% | Pass@1 | Yes | Source |