Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Qwen 3.6 Plus | 01 Apr 2026 | 93.30% | - | Yes | Source |
| Qwen 3.5 397B A17B | 16 Feb 2026 | 93% | - | Yes | Source |
| Kimi K2 (2025-07-11) | 11 Jul 2025 | 92.50% | EM | Yes | Source |
| Qwen 3.5 122B A10B | 24 Feb 2026 | 91.90% | - | Yes | Source |
| Qwen 3.5 27B | 24 Feb 2026 | 90.50% | - | Yes | Source |
| Qwen 3.5 Flash | 23 Feb 2026 | 90.50% | inferred family alias from qwen3.5-27b (score=0.4147; benches=81) | Yes | Source |
| Qwen 3.5 35B A3B | 24 Feb 2026 | 90.20% | - | Yes | Source |
| Kimi K1.5 | 20 Jan 2025 | 88.30% | - | Yes | Source |
| Qwen 3.5 9B | 02 Mar 2026 | 88.20% | - | Yes | Source |
| DeepSeek OCR | 20 Oct 2025 | 86.50% | inferred family alias from deepseek-v3 (score=0.3000; benches=20) | Yes | Source |
| DeepSeek V2 (2024-06-28) | 28 Jun 2024 | 86.50% | inferred family alias from deepseek-v3 (score=0.4159; benches=20) | Yes | Source |
| DeepSeek V4 | - | 86.50% | inferred high-confidence family alias from deepseek-v3 (score=0.5818; benches=20) | Yes | Source |
| Qwen 3.5 4B | 02 Mar 2026 | 85.10% | - | Yes | Source |
| Qwen 2 Math 72B | - | 83.80% | inferred high-confidence family alias from qwen2-72b-instruct (score=0.4667; benches=17) | Yes | Source |
| Qwen 2 Math RM 72B | - | 83.80% | inferred family alias from qwen2-72b-instruct (score=0.3917; benches=17) | Yes | Source |
| Qwen 2 Audio 7B | - | 77.20% | inferred modality/version alias from qwen2-7b-instruct | Yes | Source |
| Qwen 2 Math 7B | - | 77.20% | inferred high-confidence family alias from qwen2-7b-instruct (score=0.4706; benches=14) | Yes | Source |
| Qwen 3.5 2B | 02 Mar 2026 | 73.20% | - | Yes | Source |
| Qwen 3.5 0.8B | 02 Mar 2026 | 50.50% | - | Yes | Source |
| Ernie 4.5 VL 424B A47B | - | 40.70% | inferred version-family alias from ernie-4.5 | Yes | Source |
| Ernie 4.5 VL 28B A3B | - | 40.70% | inferred version-family alias from ernie-4.5 | Yes | Source |
| Ernie 4.5 21B A3B Thinking | - | 40.70% | inferred version-family alias from ernie-4.5 | Yes | Source |
| Ernie 4.5 300B A47B | - | 40.70% | inferred version-family alias from ernie-4.5 | Yes | Source |
| Ernie 4.5 Turbo | - | 40.70% | inferred version-family alias from ernie-4.5 | Yes | Source |
| Ernie 4.5 21B A3B | - | 40.70% | inferred version-family alias from ernie-4.5 | Yes | Source |