o3 mini
OpenAI
Highlights
Top benchmark results for openai/o3-mini-2025-01-30.
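AidanBench: 4984 (rank #2)
Aider-Polyglot: 0.60 (rank #14)
AIME 2024: 0.87 (rank #11)
ARC-AGI-1: 0.34 (rank #14)
ARC-AGI-2: 0.03 (rank #11)
Confabulations: 18.43 (rank #25)
Elimination Game: 5.51 (rank #10)
GPQA Diamond: 0.80 (rank #21)
LisanBench: 324 (rank #4)
LMArena Text: 1364 (rank #20)
LMArena WebDev: 1136 (rank #15)
NYT Connections: 0.61 (rank #7)
SimpleBench: 0.23 (rank #26)
Thematic Generalisation: 1.85 (rank #8)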
Benchmark table
Detailed scores across tracked benchmarks.
| Benchmark | Category | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| AidanBench | - | 4984 | High Reasoning Effort | No | Source |
| Aider-Polyglot | code | 0.60 | High Reasoning Effort | No | Source |
| AIME 2024 | math | 0.87 | - | Yes | Source |
| ARC-AGI-1 | - | 0.34 | High Reasoning Effort | No | Source |
| ARC-AGI-2 | - | 0.03 | High Reasoning Effort | No | Source |
| Confabulations | - | 18.43 | High Reasoning Effort | No | Source |
| Elimination Game | - | 5.51 | Medium Reasoning Effort | No | Source |
| GPQA Diamond | general-knowledge | 0.80 | High Reasoning Effort | Yes | - |
| LisanBench | - | 324 | - | No | Source |
| LMArena Text | - | 1364 | High Reasoning Effort | No | Source |
| LMArena WebDev | - | 1136 | High Reasoning Effort, 16th June 2025 | No | Source |
| NYT Connections | - | 0.61 | High Reasoning Effort | No | Source |
| SimpleBench | - | 0.23 | High Reasoning Effort | No | Source |
| Thematic Generalisation | - | 1.85 | Medium Reasoning Effort | No | Source |
Benchmark comparisons
Confabulations
Benchmark
17.91
Rank #25 of 43 models
Lower scores indicate stronger performance.