Aider-Polyglot
Total Models: 35
Average Score: 56.05
Score Range: 8.90 - 88.00
Max Score Achievable: 1
[Chart: Top 10 Model Performance (top 10 of 35 models)]
Models Using This Benchmark (35)
OpenAI (15 models)
GPT-5 (openai): 88.00%
o3 Pro (openai): 84.90%
o3 (openai): 81.30%
o4 Mini (openai): 72.00%
GPT-5 mini (openai): 71.60%
o1 (openai): 61.70%
o3-mini (openai): 60.40%
GPT-4.1 (openai): 52.40%
GPT-5 nano (openai): 48.40%
GPT-4.5 (openai): 44.90%
gpt-oss-120b (openai): 44.40%
gpt-oss-20b (openai): 34.20%
o1 mini (openai): 32.90%
GPT-4.1 Mini (openai): 32.40%
GPT-4.1 Nano (openai): 8.90%
Google (7 models)
Gemini 2.5 Pro Preview (google): 82.20%
Gemini 2.5 Pro Preview (google): 76.50%
Gemini 2.5 Pro Experimental (google): 74.00%
Gemini 2.5 Flash Preview (google): 61.90%
Gemini 2.5 Flash Preview (google): 51.10%
Gemini 2.5 Flash Lite Preview (google): 26.70%
Gemini 2.0 Flash (google): 22.20%
Anthropic (5 models)
Claude Opus 4 (anthropic): 72.00%
Claude 3.7 Sonnet (anthropic): 64.90%
Claude Sonnet 4 (anthropic): 61.30%
Claude 3.5 Sonnet (anthropic): 51.60%
Claude 3.5 Haiku (anthropic): 28.00%
DeepSeek (2 models)
R1 (deepseek): 56.90%
DeepSeek-V3 0324 (deepseek): 55.10%
Qwen (2 models)
Qwen3 Coder 480B A35B Instruct (qwen): 61.80%
Qwen3 235B A22B Instruct 2507 (qwen): 57.30%
xAI (2 models)
Grok 4 (x-ai): 79.60%
Grok 3 Beta (x-ai): 53.30%
Mistral (1 model)
Magistral Medium (mistral): 47.10%
Moonshot (1 model)
Kimi K2 Instruct (moonshotai): 60.00%