GPT 5 Nano
OpenAI
Highlights
Top benchmark results for openai/gpt-5-nano-2025-08-07.
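| Benchmark | Top Score | Rank |
|---|---|---|
| Aider-Polyglot | 0.48 | #22 |
| AIME 2025 | 0.85 | #17 |
| ARC-AGI-1 | 0.21 | #21 |
| ARC-AGI-2 | 0.03 | #13 |
| CharXiv-Reasoning | 0.63 | #3 |
| COLLIE | 0.97 | #3 |
| ERQA | 0.50 | #3 |
| Frontier Math | 0.10 | #4 |
| GPQA Diamond | 0.71 | #36 |
| Graphwalks bfs <128k | 0.64 | #4 |
| Graphwalks parents <128k | 0.44 | #7 |
| HMMT 2025 | 0.76 | #7 |
| Humanity's Last Exam | 0.09 | #18 |
| MMLU Pro | 0.55 | #5 |
| MMMU | 0.76 | #10 |
| MMMU Pro | 0.63 | #4 |
| SWE-Bench | 0.55 | #14 |
| Tau 2 Airline | 0.41 | #7 |
| Tau 2 Retail | 0.62 | #7 |
| Tau 2 Telecom | 0.35 | #7 |
| Video MMMU | 0.67 | #3 |
| VideoMME | 0.66 | #7 |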
Benchmark table
| Benchmark | Category | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Aider-Polyglot | code | 0.48 | High Reasoning Effort, Diff Method | Yes | Source |
| AIME 2025 | math | 0.85 | High Reasoning Effort, No Tools | Yes | Source |
| ARC-AGI-1 | - | 0.21 | Medium Reasoning Effort | No | Source |
| ARC-AGI-2 | - | 0.03 | High Reasoning Effort | No | Source |
| BrowseComp Long Context 128k | - | 0.80 | High Reasoning Effort | Yes | Source |
| BrowseComp Long Context 256k | - | 0.68 | High Reasoning Effort | Yes | Source |
| CharXiv-Reasoning | - | 0.63 | High Reasoning Effort | Yes | Source |
| COLLIE | - | 0.97 | High Reasoning Effort | Yes | Source |
| ERQA | - | 0.50 | High Reasoning Effort | Yes | Source |
| FActScore hallucination rate | hallucinations | 0.07 | High Reasoning Effort | Yes | Source |
| Frontier Math | math | 0.10 | With Thinking, With Python, Pass @ 1 | Yes | Source |
| GPQA Diamond | general-knowledge | 0.71 | High Reasoning Effort, No Tools | Yes | Source |
| Graphwalks bfs <128k | - | 0.64 | High Reasoning Effort | Yes | Source |
| Graphwalks parents <128k | - | 0.44 | High Reasoning Effort | Yes | Source |
| HMMT 2025 | - | 0.76 | High Reasoning Effort, No Tools | Yes | Source |
| Humanity's Last Exam | - | 0.09 | High Reasoning Effort, No Tools | Yes | Source |
| LongFact-Concepts hallucination rate | hallucinations | 0.01 | High Reasoning Effort | Yes | Source |
| LongFact-Objects hallucination rate | hallucinations | 0.03 | High Reasoning Effort | Yes | Source |
| MMLU Pro | - | 0.55 | High Reasoning Effort | Yes | Source |
| MMMU | - | 0.76 | High Reasoning Effort | Yes | Source |
| MMMU Pro | - | 0.63 | High Reasoning Effort | Yes | Source |
| OpenAI-MRCR: 2 needle 128k | - | 0.43 | High Reasoning Effort | Yes | Source |
| OpenAI-MRCR: 2 needle 256k | - | 0.35 | High Reasoning Effort | Yes | Source |
| SWE-Bench | code | 0.55 | High Reasoning Effort, No Tools | Yes | Source |
| Tau 2 Airline | - | 0.41 | High Reasoning Effort | Yes | Source |
| Tau 2 Retail | - | 0.62 | High Reasoning Effort | Yes | Source |
| Tau 2 Telecom | - | 0.35 | High Reasoning Effort | Yes | Source |
| Video MMMU | - | 0.67 | High Reasoning Effort | Yes | Source |
| VideoMME | - | 0.66 | High Reasoning Effort | Yes | Source |
Benchmark comparisons
LongFact-Concepts hallucination rate: 0.01 (rank #2 of 7 models)