Terminal Bench - Benchmark Leaderboard & Model Performance | AI Stats
Terminal Bench
Overview
Recorded Results: 3
Average Score: 0.39
Score Range: 0.33 - 0.43
Leading Model: Claude Opus 4.1 (0.43)
Scores Over Time
Individual benchmark scores plotted by date.
Models Using This Benchmark
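| Organisation | Model           | Reported    | Top Score | Info | Self Reported | Source |
|--------------|-----------------|-------------|-----------|------|---------------|--------|
| Anthropic    | Claude Opus 4.1 | 05 Aug 2025 | 0.43      | -    | Yes           | Source |
| Amazon       | Nova 2 Pro      | 02 Dec 2025 | 0.41      | -    | Yes           | Source |
| Amazon       | Nova 2 Lite     | 02 Dec 2025 | 0.33      | -    | Yes           | Source |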
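The summary figures on this page (3 recorded results, an average of 0.39, a range of 0.33 to 0.43, and Claude Opus 4.1 as the leading model) are consistent with the three rows listed above. The following is a minimal sketch of how they could be reproduced. It assumes the average is a plain arithmetic mean rounded to two decimals, which the page does not state; the `results` list and the `summarise` helper are hypothetical names introduced for illustration only.

```python
from statistics import mean

# Rows transcribed from the leaderboard: (organisation, model, top score).
results = [
    ("Anthropic", "Claude Opus 4.1", 0.43),
    ("Amazon", "Nova 2 Pro", 0.41),
    ("Amazon", "Nova 2 Lite", 0.33),
]


def summarise(rows):
    """Derive the headline statistics from a list of result rows.

    Assumption: the average is an unweighted mean rounded to 2 decimals.
    """
    scores = [score for _, _, score in rows]
    leader = max(rows, key=lambda row: row[2])  # row with the highest score
    return {
        "recorded_results": len(rows),
        "average_score": round(mean(scores), 2),
        "score_range": (min(scores), max(scores)),
        "leading_model": leader[1],
        "leading_score": leader[2],
    }


summary = summarise(results)
print(summary)
```

Under that assumption, the computed values match the figures shown in the overview, which suggests the overview is derived directly from the per-model top scores.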