HealthBench Consensus - Benchmark Leaderboard & Model Performance | AI Stats
HealthBench Consensus
Overview
Type: percentage
Health
Recorded Results: 6
Average Score: 86.97%
Score Range: 82.60% - 90.80%
Leading Model: GPT OSS 120b (90.80%)
Scores Over Time
Individual benchmark scores plotted by date.
Models Using This Benchmark
| Organisation | Model | Reported | Top Score | Info | Self Reported |
| --- | --- | --- | --- | --- | --- |
| OpenAI | GPT OSS 120b | 05 Aug 2025 | 90.80% | Medium Reasoning Effort | Yes |
| OpenAI | GPT OSS 20b | 05 Aug 2025 | 84.90% | Low Reasoning Effort | Yes |