Qwen3 235B A22B Thinking 2507
Qwen
Highlights
Top benchmark results for qwen/qwen3-235b-a22b-thinking-2507-2025-07-25.
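| Benchmark | Score | Rank |
|---|---|---|
| AIME 2025 | 0.92 | #9 |
| Confabulations | 16.77 | #22 |
| Creative Story Writing | 8.24 | #5 |
| GPQA Diamond | 0.81 | #17 |
| HMMT 2025 | 0.84 | #4 |
| Humanity's Last Exam | 0.18 | #9 |
| IFEval | 0.88 | #3 |
| LiveBench | 0.78 | #1 |
| LiveCodeBench V6 | 0.74 | #1 |
| MathArena Apex | 0.05 | #2 |
| MMLU Redux | 0.94 | #1 |
| MMLU-Pro | 0.84 | #1 |
| SuperGPQA | 0.65 | #1 |
| Tau 2 Airline | 0.58 | #4 |
| Tau 2 Retail | 0.72 | #4 |
| Tau 2 Telecom | 0.46 | #5 |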
Benchmark table
Detailed scores across tracked benchmarks.
| Benchmark | Category | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| AIME 2025 | math | 0.92 | - | Yes | Source |
| Confabulations | - | 16.77 | - | No | Source |
| Creative Story Writing | - | 8.24 | - | No | Source |
| GPQA Diamond | general-knowledge | 0.81 | - | Yes | Source |
| HMMT 2025 | - | 0.84 | - | Yes | Source |
| Humanity's Last Exam | - | 0.18 | Text Only | Yes | Source |
| IFEval | - | 0.88 | - | Yes | Source |
| LiveBench | - | 0.78 | 2024-11-25 | Yes | Source |
| LiveCodeBench V6 | - | 0.74 | - | Yes | Source |
| MathArena Apex | - | 0.05 | - | No | Source |
| MMLU Redux | - | 0.94 | - | Yes | Source |
| MMLU-Pro | - | 0.84 | - | Yes | Source |
| Online Judgement Benchmark | - | 0.33 | - | Yes | Source |
| SuperGPQA | - | 0.65 | - | Yes | Source |
| Tau 2 Airline | - | 0.58 | - | Yes | Source |
| Tau 2 Retail | - | 0.72 | - | Yes | Source |
| Tau 2 Telecom | - | 0.46 | - | Yes | Source |
Benchmark comparisons
AIME 2025: score 0.92, rank #9 of 39 models.