Terminal Bench 2.1 - Benchmark Leaderboard & Model Performance | AI Stats
Terminal Bench 2.1
Overview
Type: percentage
Code
Recorded Results: 2
Average Score: 75.40%
Score Range: 74.60% - 76.20%
Leading Model: Gemini 3.5 Flash (76.20%)
Scores Over Time
Individual benchmark scores plotted by date.
Models Using This Benchmark
Organisation | Model | Reported | Top Score | Info | Self Reported
Google | Gemini 3.5 Flash | 19 May 2026 | 76.20% | - | Yes
Anthropic | Claude Opus 4.8 | 28 May 2026 | 74.60% | Terminus-2 public harness; max effort | Yes
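The overview figures can be reproduced from the two recorded results. The sketch below assumes the average is a plain arithmetic mean of the listed top scores, which matches the published 75.40% but is not confirmed by the page. The `summarize` helper and the `results` mapping are hypothetical names used only for illustration.

```python
from statistics import mean

# Top scores (in percent) as listed for the two recorded results.
results = {
    "Gemini 3.5 Flash": 76.20,
    "Claude Opus 4.8": 74.60,
}


def summarize(scores):
    """Derive the overview figures from a model -> score mapping.

    Assumption: the average is an unweighted arithmetic mean.
    """
    values = list(scores.values())
    return {
        "count": len(values),
        "average": round(mean(values), 2),
        "low": min(values),
        "high": max(values),
        "leader": max(scores, key=scores.get),
    }


summary = summarize(results)
print(f"Recorded Results: {summary['count']}")
print(f"Average Score: {summary['average']:.2f}%")
print(f"Score Range: {summary['low']:.2f}% - {summary['high']:.2f}%")
print(f"Leading Model: {summary['leader']} ({summary['high']:.2f}%)")
```

With only two results, both self-reported and one run under a named harness and effort setting, the 1.6-point gap is best read as indicative rather than as a controlled comparison.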