Claude 3.5 Sonnet (2024-10-22)
Anthropic
Highlights
Top benchmark results for anthropic/claude-3-5-sonnet-2024-10-22.
- AidanBench: 2734 (rank #4)
- Aider-Polyglot: 0.52 (rank #21)
- AIME 2024: 0.16 (rank #38)
- Confabulations: 19.94 (rank #30)
- Elimination Game: 6.26 (rank #5)
- EQ-Bench 3: 1068 (rank #21)
- GPQA Diamond: 0.65 (rank #44)
- LisanBench: 167 (rank #7)
- LiveBench: 0.52 (rank #14)
- LMArena Text: 1364 (rank #20)
- LMArena WebDev: 1238 (rank #8)
- NYT Connections: 0.18 (rank #25)
- SimpleBench: 0.41 (rank #10)
- Thematic Generalisation: 1.93 (rank #12)
Benchmark table
Detailed scores across tracked benchmarks.
| Benchmark | Category | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| AidanBench | - | 2734 | - | No | Source |
| Aider-Polyglot | code | 0.52 | - | No | Source |
| AIME 2024 | math | 0.16 | - | Yes | - |
| Confabulations | - | 19.94 | - | No | Source |
| Elimination Game | - | 6.26 | - | No | Source |
| EQ-Bench 3 | - | 1068 | - | No | Source |
| GPQA Diamond | general-knowledge | 0.65 | - | Yes | - |
| LisanBench | - | 167 | - | No | Source |
| LiveBench | - | 0.52 | - | No | Source |
| LMArena Text | - | 1364 | - | No | Source |
| LMArena WebDev | - | 1238 | 16 June 2025 | No | Source |
| NYT Connections | - | 0.18 | - | No | Source |
| SimpleBench | - | 0.41 | - | No | Source |
| Thematic Generalisation | - | 1.93 | - | No | Source |
Benchmark comparisons
How this model stacks up against its closest peers on a selected benchmark.
LMArena Text: 1364 (rank #20 of 74 tracked models).
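The rank figures above (e.g. #20 of 74 on LMArena Text) are simply a model's position once all tracked scores for that benchmark are sorted. A minimal sketch of that calculation, assuming higher scores are better and using hypothetical peer scores (only this model's 1364 comes from the table):

```python
def rank_of(model_score: float, all_scores: list[float], higher_is_better: bool = True) -> int:
    """Return the 1-based leaderboard rank of `model_score` within `all_scores`."""
    if higher_is_better:
        better = [s for s in all_scores if s > model_score]
    else:
        better = [s for s in all_scores if s < model_score]
    return len(better) + 1

# Hypothetical peer scores for illustration only, not the real LMArena Text data.
peer_scores = [1420, 1398, 1377, 1364, 1350, 1342]
print(rank_of(1364, peer_scores))  # -> 4 in this toy example
```

Benchmarks where lower is better (error- or confabulation-style metrics) would pass `higher_is_better=False`.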