Qwen3 A235 A22B Instruct 2507 Benchmarks - Performance Metrics & Comparisons | AI Stats

Qwen

Highlights

Top benchmark results for qwen/qwen3-a235-a22b-instruct-2507-2025-07-21.

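Benchmark	Score	Rank
Aider-Polyglot	0.57	#15
AIME 2025	0.70	#25
ARC-AGI-1	0.42	#11
CSimpleQA	0.84	#1
GPQA Diamond	0.78	#29
HMMT 2025	0.55	#9
IFEval	0.89	#1
LiveBench	0.75	#2
LiveCodeBench V6	0.52	#3
MMLU Redux	0.93	#2
MMLU-Pro	0.83	#3
Multi‑Programming Language Evaluation	0.88	#1
SimpleQA	0.54	#1
SuperGPQA	0.63	#2
Tau Bench (Airline)	0.44	#5
Tau Bench (Retail)	0.71	#3
ZebraLogic	0.95	#1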

Benchmark table

Benchmark	Category	Top Score	Info	Self Reported	Source
Aider-Polyglot	code	0.57	-	Yes	Source
AIME 2025	math	0.70	-	Yes	Source
ARC-AGI-1	-	0.42	Not confirmed by ARC-AGI	Yes	Source
CSimpleQA	-	0.84	-	Yes	Source
GPQA Diamond	general-knowledge	0.78	-	Yes	Source
HMMT 2025	-	0.55	-	Yes	Source
IFEval	-	0.89	-	Yes	Source
LiveBench	-	0.75	2024-11-25	Yes	Source
LiveCodeBench V6	-	0.52	-	Yes	Source
MMLU Redux	-	0.93	-	Yes	Source
MMLU-Pro	-	0.83	-	Yes	Source
Multi‑Programming Language Evaluation	-	0.88	-	Yes	Source
SimpleQA	-	0.54	-	Yes	Source
SuperGPQA	-	0.63	-	Yes	Source
Tau Bench (Airline)	-	0.44	-	Yes	Source
Tau Bench (Retail)	-	0.71	-	Yes	Source
ZebraLogic	-	0.95	-	Yes	Source

Benchmark comparisons

Aider-Polyglot: score 0.57, rank #15 of 32 models.