GPT 4.1 Mini
OpenAI
Highlights
Top benchmark results for openai/gpt-4-1-mini-2025-04-14.
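| Benchmark | Score | Rank |
|---|---|---|
| Ai2 SciArena | 1026 | #9 |
| Aider-Polyglot | 0.32 | #29 |
| AIME 2024 | 0.50 | #31 |
| ARC-AGI-1 | 0.04 | #28 |
| ARC-AGI-2 | 0 | #19 |
| EQ-Bench 3 | 1145 | #17 |
| GPQA Diamond | 0.65 | #44 |
| Graphwalks bfs <128k | 0.62 | #6 |
| Graphwalks parents <128k | 0.60 | #4 |
| LiveBench | 0.52 | #15 |
| LMArena Text | 1372 | #19 |
| LMArena WebDev | 1189 | #12 |
| NYT Connections | 0.15 | #27 |
| VideoMME | 0.68 | #6 |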
Benchmark table
Detailed scores across tracked benchmarks.
| Benchmark | Category | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Ai2 SciArena | - | 1026 | - | No | Source |
| Aider-Polyglot | code | 0.32 | - | No | Source |
| AIME 2024 | math | 0.50 | - | Yes | Source |
| ARC-AGI-1 | - | 0.04 | - | No | Source |
| ARC-AGI-2 | - | 0 | - | No | Source |
| BrowseComp Long Context 128k | - | 0.89 | - | Yes | Source |
| BrowseComp Long Context 256k | - | 0.82 | - | Yes | Source |
| EQ-Bench 3 | - | 1145 | - | No | Source |
| FActScore hallucination rate | hallucinations | 0.11 | - | Yes | Source |
| GPQA Diamond | general-knowledge | 0.65 | - | Yes | Source |
| Graphwalks bfs <128k | - | 0.62 | - | Yes | Source |
| Graphwalks parents <128k | - | 0.60 | - | Yes | Source |
| LiveBench | - | 0.52 | - | No | Source |
| LMArena Text | - | 1372 | - | No | Source |
| LMArena WebDev | - | 1189 | 16th June 2025 | No | Source |
| LongFact-Concepts hallucination rate | hallucinations | 0.01 | - | Yes | Source |
| LongFact-Objects hallucination rate | hallucinations | 0.02 | - | Yes | Source |
| NYT Connections | - | 0.15 | - | No | Source |
| OpenAI-MRCR: 2 needle 128k | - | 0.47 | - | Yes | Source |
| OpenAI-MRCR: 2 needle 256k | - | 0.46 | - | Yes | Source |
| VideoMME | - | 0.68 | - | Yes | Source |
Benchmark comparisons
Aider-Polyglot: score 0.32, rank #29 of 34 models.