Individual benchmark scores plotted by date.
| Organisation | Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|---|
| | Qwen 3 235B A22B Instruct 2507 | - | 87.90 | - | Yes | Source |
| | Qwen 3 Next 80B A3B Instruct | - | 87.80 | LLM Stats (ZeroEval) | Yes | Source |
| | Qwen 3 VL 235B A22B Instruct | - | 86.10 | LLM Stats (ZeroEval) | Yes | Source |
| | Kimi K2 (2025-09-05) | 05 Sept 2025 | 85.70 | Pass@1 | Yes | Source |
| | Qwen 72B | - | 75.10 | LLM Stats (ZeroEval); inferred family alias from qwen-2.5-72b-instruct (score=0.3060; benches=14) | Yes | Source |
| | Qwen 14B | - | 72.80 | LLM Stats (ZeroEval); inferred family alias from qwen-2.5-14b-instruct (score=0.3060; benches=16) | Yes | Source |
| | Qwen 7B | - | 70.40 | LLM Stats (ZeroEval); inferred family alias from qwen-2.5-7b-instruct (score=0.3083; benches=14) | Yes | Source |
| | Qwen 2 Math 72B | - | 69.20 | LLM Stats (ZeroEval); inferred high-confidence family alias from qwen2-72b-instruct (score=0.4667; benches=17) | Yes | Source |
| | Qwen 2 Math RM 72B | - | 69.20 | LLM Stats (ZeroEval); inferred family alias from qwen2-72b-instruct (score=0.3917; benches=17) | Yes | Source |
| | Qwen 3 235B A22B | - | 65.94 | LLM Stats (ZeroEval) | Yes | Source |
| | Qwen 2.5 Coder 3B | - | 65.80 | LLM Stats (ZeroEval); inferred family alias from qwen2.5-omni-7b (score=0.3000; benches=45) | Yes | Source |
| | Qwen 2.5 Coder 7B | - | 65.80 | LLM Stats (ZeroEval); inferred high-confidence family alias from qwen2.5-omni-7b (score=0.4700; benches=45) | Yes | Source |
| | Qwen 2.5 Math 7B | - | 65.80 | LLM Stats (ZeroEval); inferred high-confidence family alias from qwen2.5-omni-7b (score=0.4767; benches=45) | Yes | Source |
| | Qwen 2.5 Math 7B PRM800K | - | 65.80 | LLM Stats (ZeroEval); inferred family alias from qwen2.5-omni-7b (score=0.3696; benches=45) | Yes | Source |
| | Qwen 2.5 Math PRM 7B | - | 65.80 | LLM Stats (ZeroEval); inferred family alias from qwen2.5-omni-7b (score=0.4092; benches=45) | Yes | Source |
| | Qwen 2.5 Omni 3B | - | 65.80 | LLM Stats (ZeroEval); inferred high-confidence family alias from qwen2.5-omni-7b (score=0.4933; benches=45) | Yes | Source |
| | Qwen 2.5 Omni 7B | - | 65.80 | LLM Stats (ZeroEval) | Yes | Source |
| | Qwen 2 Audio 7B | - | 59.10 | LLM Stats (ZeroEval); inferred modality/version alias from qwen2-7b-instruct | Yes | Source |
| | Qwen 2 Math 7B | - | 59.10 | LLM Stats (ZeroEval); inferred high-confidence family alias from qwen2-7b-instruct (score=0.4706; benches=14) | Yes | Source |