Individual benchmark scores plotted by date.

| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Gemini 1.5 Pro Exp (2024-08-01) | 01 Aug 2024 | 89.20 | inferred alias from gemini-1.5-pro | Yes | LLM Stats (ZeroEval) |
| Gemini 1.5 Pro 001 | 23 May 2024 | 89.20 | inferred alias from gemini-1.5-pro | Yes | LLM Stats (ZeroEval) |
| Gemini Robotics ER 1.5 Preview | 25 Sep 2025 | 89.20 | inferred family alias from gemini-1.5-pro (score=0.3717; benches=23) | Yes | LLM Stats (ZeroEval) |
| Gemini 1.5 Pro Exp (2024-08-27) | 27 Aug 2024 | 89.20 | inferred alias from gemini-1.5-pro | Yes | LLM Stats (ZeroEval) |
| LearnLM 1.5 Pro Experimental | 19 Nov 2024 | 89.20 | inferred family alias from gemini-1.5-pro (score=0.3700; benches=23) | Yes | LLM Stats (ZeroEval) |
| Gemini 1.5 Flash 001 | 23 May 2024 | 85.50 | inferred alias from gemini-1.5-flash | Yes | LLM Stats (ZeroEval) |
| Gemini 1.5 Flash Preview | 14 May 2024 | 85.50 | inferred alias from gemini-1.5-flash | Yes | LLM Stats (ZeroEval) |
| Phi 3.5 MoE instruct | 23 Aug 2024 | 79.10 | - | Yes | LLM Stats (ZeroEval) |
| Phi 4 Mini | 01 Feb 2025 | 70.40 | - | Yes | LLM Stats (ZeroEval) |
| Granite 3.2 8B Instruct | - | 69.13 | inferred high-confidence family alias from granite-3.3-8b-instruct (score=0.4911; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite 3.2 8B Instruct Preview | - | 69.13 | inferred high-confidence family alias from granite-3.3-8b-instruct (score=0.4687; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite Guardian 3.0 8B | - | 69.13 | inferred family alias from granite-3.3-8b-instruct (score=0.4062; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite 3.3 8B Instruct | 16 Apr 2025 | 69.13 | - | Yes | LLM Stats (ZeroEval) |
| Granite Guardian 3.1 8B | - | 69.13 | inferred family alias from granite-3.3-8b-instruct (score=0.4062; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite Guardian 3.3 8B | - | 69.13 | inferred high-confidence family alias from granite-3.3-8b-instruct (score=0.5071; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite Speech 3.2 8B | - | 69.13 | inferred family alias from granite-3.3-8b-instruct (score=0.4062; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite 3.0 8B Instruct | - | 69.13 | inferred high-confidence family alias from granite-3.3-8b-instruct (score=0.4911; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite Speech 3.3 8B | - | 69.13 | inferred high-confidence family alias from granite-3.3-8b-instruct (score=0.5071; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite 3.1 8B Instruct | - | 69.13 | inferred high-confidence family alias from granite-3.3-8b-instruct (score=0.4911; benches=14) | Yes | LLM Stats (ZeroEval) |
| Granite 3.3 2B Instruct | 16 Apr 2025 | 69.13 | inferred family alias from granite-3.3-8b-instruct (score=0.3627; benches=14) | Yes | LLM Stats (ZeroEval) |
| Phi 3 Mini 128K Instruct | - | 69.00 | inferred family alias from phi-3.5-mini-instruct (score=0.3533; benches=31) | Yes | LLM Stats (ZeroEval) |
| Phi 3.5 mini instruct | 23 Aug 2024 | 69.00 | - | Yes | LLM Stats (ZeroEval) |
| Granite 4.0 Tiny Preview | 02 May 2025 | 55.70 | - | Yes | LLM Stats (ZeroEval) |
| Granite 4.0 Small | 02 Oct 2025 | 55.70 | inferred high-confidence family alias from granite-4.0-tiny-preview (score=0.4700; benches=12) | Yes | LLM Stats (ZeroEval) |
| Granite 4.0 Micro | 02 Oct 2025 | 55.70 | inferred high-confidence family alias from granite-4.0-tiny-preview (score=0.4700; benches=12) | Yes | LLM Stats (ZeroEval) |
| Granite 4.0 Tiny | 02 Oct 2025 | 55.70 | inferred alias from granite-4.0-tiny-preview | Yes | LLM Stats (ZeroEval) |
| Gemma 3n E4B | 25 Jun 2025 | 52.90 | - | Yes | LLM Stats (ZeroEval) |
| Gemma 3n E2B | 25 Jun 2025 | 44.30 | - | Yes | LLM Stats (ZeroEval) |