GPT 5 nano
OpenAI
Highlights
Top benchmark results for openai/gpt-5-nano-2025-08-07.
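| Benchmark | Score | Rank |
|---|---|---|
| Aider-Polyglot | 0.48 | #23 |
| AIME 2025 | 0.85 | #15 |
| ARC-AGI-1 | 0.21 | #20 |
| ARC-AGI-2 | 0.03 | #12 |
| CharXiv-Reasoning | 0.63 | #4 |
| COLLIE | 0.97 | #3 |
| ERQA | 0.50 | #3 |
| Frontier Math | 0.10 | #3 |
| GPQA Diamond | 0.71 | #34 |
| Graphwalks bfs <128k | 0.64 | #4 |
| Graphwalks parents <128k | 0.44 | #7 |
| HMMT 2025 | 0.76 | #5 |
| Humanity's Last Exam | 0.09 | #17 |
| MMLU Pro | 0.55 | #3 |
| MMMU | 0.76 | #10 |
| MMMU Pro | 0.63 | #4 |
| SWE-Bench | 0.55 | #14 |
| Tau 2 Airline | 0.41 | #6 |
| Tau 2 Retail | 0.62 | #6 |
| Tau 2 Telecom | 0.35 | #6 |
| Video MMMU | 0.67 | #4 |
| VideoMME | 0.66 | #7 |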
Benchmark table
Detailed scores across tracked benchmarks.
| Benchmark | Category | Top Score | Evaluation Settings | Self Reported | Source |
|---|---|---|---|---|---|
| Aider-Polyglot | code | 0.48 | High Reasoning Effort, Diff Method | Yes | Source |
| AIME 2025 | math | 0.85 | High Reasoning Effort, No Tools | Yes | Source |
| ARC-AGI-1 | - | 0.21 | Medium Reasoning Effort | No | Source |
| ARC-AGI-2 | - | 0.03 | High Reasoning Effort | No | Source |
| BrowseComp Long Context 128k | - | 0.80 | High Reasoning Effort | Yes | Source |
| BrowseComp Long Context 256k | - | 0.68 | High Reasoning Effort | Yes | Source |
| CharXiv-Reasoning | - | 0.63 | High Reasoning Effort | Yes | Source |
| COLLIE | - | 0.97 | High Reasoning Effort | Yes | Source |
| ERQA | - | 0.50 | High Reasoning Effort | Yes | Source |
| FActScore hallucination rate | hallucinations | 0.07 | High Reasoning Effort | Yes | Source |
| Frontier Math | math | 0.10 | With Thinking, With Python, Pass @ 1 | Yes | Source |
| GPQA Diamond | general-knowledge | 0.71 | High Reasoning Effort, No Tools | Yes | Source |
| Graphwalks bfs <128k | - | 0.64 | High Reasoning Effort | Yes | Source |
| Graphwalks parents <128k | - | 0.44 | High Reasoning Effort | Yes | Source |
| HMMT 2025 | - | 0.76 | High Reasoning Effort, No Tools | Yes | Source |
| Humanity's Last Exam | - | 0.09 | High Reasoning Effort, No Tools | Yes | Source |
| LongFact-Concepts hallucination rate | hallucinations | 0.01 | High Reasoning Effort | Yes | Source |
| LongFact-Objects hallucination rate | hallucinations | 0.03 | High Reasoning Effort | Yes | Source |
| MMLU Pro | - | 0.55 | High Reasoning Effort | Yes | Source |
| MMMU | - | 0.76 | High Reasoning Effort | Yes | Source |
| MMMU Pro | - | 0.63 | High Reasoning Effort | Yes | Source |
| OpenAI-MRCR: 2 needle 128k | - | 0.43 | High Reasoning Effort | Yes | Source |
| OpenAI-MRCR: 2 needle 256k | - | 0.35 | High Reasoning Effort | Yes | Source |
| SWE-Bench | code | 0.55 | High Reasoning Effort, No Tools | Yes | Source |
| Tau 2 Airline | - | 0.41 | High Reasoning Effort | Yes | Source |
| Tau 2 Retail | - | 0.62 | High Reasoning Effort | Yes | Source |
| Tau 2 Telecom | - | 0.35 | High Reasoning Effort | Yes | Source |
| Video MMMU | - | 0.67 | High Reasoning Effort | Yes | Source |
| VideoMME | - | 0.66 | High Reasoning Effort | Yes | Source |
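For readers who want to work with these rows programmatically, the following is a minimal sketch of parsing them into typed records. It assumes the six-column layout used above and reads cells by position; the names `BenchmarkRow` and `parse_row` are hypothetical and not part of any published tooling. The three sample rows are copied from the table. Treating the `hallucinations` category as lower-is-better is an assumption based on those metrics being error rates.

```python
from typing import NamedTuple


class BenchmarkRow(NamedTuple):
    """One row of the benchmark table (hypothetical record type)."""
    benchmark: str
    category: str
    score: float
    settings: list[str]
    self_reported: bool


def parse_row(line: str) -> BenchmarkRow:
    """Parse a single pipe-delimited table row by column position."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    name, category, score, info, self_reported = cells[:5]
    return BenchmarkRow(
        benchmark=name,
        category=category,
        score=float(score),
        # The fourth column is a comma-separated list of settings.
        settings=[s.strip() for s in info.split(",")],
        self_reported=(self_reported == "Yes"),
    )


# Sample rows copied from the table above.
ROWS = """\
| AIME 2025 | math | 0.85 | High Reasoning Effort, No Tools | Yes | Source |
| ARC-AGI-1 | - | 0.21 | Medium Reasoning Effort | No | Source |
| FActScore hallucination rate | hallucinations | 0.07 | High Reasoning Effort | Yes | Source |
"""

rows = [parse_row(line) for line in ROWS.splitlines()]

# Assumption: hallucination rates are error rates, so lower is better.
lower_is_better = [r.benchmark for r in rows if r.category == "hallucinations"]

# Rows whose scores were not reported by the model's own vendor.
independent = [r.benchmark for r in rows if not r.self_reported]
```

Separating the lower-is-better rows matters when sorting or averaging, since mixing them with accuracy-style scores would invert their meaning.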
Benchmark comparisons
Frontier Math: 0.10, rank #3 of 3 models.