AI Stats
SWE-Bench
Total Models: 22
Average Score: 56.35
Score Range: 9.00 - 74.90
Max Score Achievable: 1
[Chart: Top 10 Model Performance (top 10 of 22 models)]
Models Using This Benchmark (22)

OpenAI (9 models)
GPT-5 (openai): 74.90%
GPT-5 mini (openai): 71.00%
gpt-oss-120b (openai): 62.40%
gpt-oss-20b (openai): 60.70%
GPT-5 nano (openai): 54.70%
GPT-4.1 (openai): 54.60%
GPT-4.5 (openai): 38.00%
GPT-4o (openai): 33.00%
GPT-4o-mini (openai): 9.00%

Google (6 models)
Gemini 2.5 Pro Preview (google): 67.20%
Gemini 2.5 Pro Experimental (google): 63.80%
Gemini 2.5 Pro Preview (google): 63.20%
Gemini 2.5 Flash Preview (google): 60.40%
Gemini 2.5 Flash Lite Preview (google): 44.90%
Gemini Diffusion (google): 22.90%

Mistral (2 models)
Devstral Medium 1.1 (mistral): 61.60%
Devstral Small 1.1 (mistral): 53.60%

Anthropic (1 model)
Claude Opus 4.1 (anthropic): 74.50%

MiniMax (1 model)
MiniMax M1 (minimax): 56.00%

Moonshot (1 model)
Kimi K2 Instruct (moonshotai): 71.60%

Qwen (1 model)
Qwen3 Coder 480B A35B Instruct (qwen): 69.60%

xAI (1 model)
Grok 4 Code (x-ai): 72.00%
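
The summary figures at the top of the page (total models, average score, score range) can be reproduced from the per-model scores listed above. The following is a minimal sketch, assuming the average is a plain unweighted mean over all 22 entries. The variable names are illustrative and not part of the site, and the scores are transcribed by hand from the listing.

```python
from statistics import mean

# Scores in percent, transcribed from the per-provider listing above.
# The grouping mirrors the page; the dictionary itself is an illustrative structure.
scores = {
    "OpenAI": [74.90, 71.00, 62.40, 60.70, 54.70, 54.60, 38.00, 33.00, 9.00],
    "Google": [67.20, 63.80, 63.20, 60.40, 44.90, 22.90],
    "Mistral": [61.60, 53.60],
    "Anthropic": [74.50],
    "MiniMax": [56.00],
    "Moonshot": [71.60],
    "Qwen": [69.60],
    "xAI": [72.00],
}

# Flatten the per-provider lists into a single list of scores.
all_scores = [s for group in scores.values() for s in group]

total_models = len(all_scores)
average = mean(all_scores)  # assumption: unweighted mean across all models
lowest, highest = min(all_scores), max(all_scores)

# The ten highest scores, i.e. the entries a "top 10" chart would show.
top_10 = sorted(all_scores, reverse=True)[:10]

print(f"Total Models: {total_models}")
print(f"Average Score: {average:.2f}")
print(f"Score Range: {lowest:.2f} - {highest:.2f}")
```

Under that assumption the computed values agree with the page's stated totals, which suggests the average is taken over every listed entry, including both Gemini 2.5 Pro Preview rows.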