Claude Opus 4.1
Anthropic
Highlights
Top benchmark results for anthropic/claude-opus-4-1-2025-08-05.
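AIME 2025: 0.78 (rank #18)
Confabulations: 18.51 (rank #26)
Creative Story Writing: 8.47 (rank #3)
GPQA Diamond: 0.81 (rank #19)
MMMLU: 0.90 (rank #3)
MMMU: 0.77 (rank #8)
SimpleBench: 0.60 (rank #3)
SWE-Bench: 0.74 (rank #4)
Tau Bench (Airline): 0.56 (rank #2)
Tau Bench (Retail): 0.82 (rank #1)
Terminal Bench: 0.43 (rank #1)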
Benchmark table
Detailed scores across tracked benchmarks.
| Benchmark | Category | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| AIME 2025 | math | 0.78 | - | Yes | Source |
| Confabulations | - | 18.51 | No Reasoning | No | Source |
| Creative Story Writing | - | 8.47 | No Reasoning | No | Source |
| GPQA Diamond | general-knowledge | 0.81 | - | Yes | Source |
| MMMLU | - | 0.90 | - | Yes | Source |
| MMMU | - | 0.77 | - | Yes | Source |
| SimpleBench | - | 0.60 | - | No | Source |
| SWE-Bench | code | 0.74 | - | Yes | Source |
| Tau Bench (Airline) | - | 0.56 | - | Yes | Source |
| Tau Bench (Retail) | - | 0.82 | - | Yes | Source |
| Terminal Bench | code | 0.43 | - | Yes | Source |
Benchmark comparisons
Creative Story Writing
Score: 8.47
Rank: #3 of 7 models