Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Gemini 2.5 Pro Preview TTS (2025-12-10) | 10 Dec 2025 | 75.60% | inferred modality/version alias from gemini-2.5-pro | Yes | Source |
| Gemini Embedding 2 Preview | 10 Mar 2026 | 75.60% | manual fallback alias from gemini-2.5-pro | Yes | Source |
| Gemini 2.5 Computer Use Preview | 07 Oct 2025 | 75.60% | inferred family alias from gemini-2.5-pro (score=0.3960; benches=16) | Yes | Source |
| Gemini 2.5 Pro Experimental (2025-03-25) | 25 Mar 2025 | 75.60% | inferred alias from gemini-2.5-pro | Yes | Source |
| EXAONE 4.0 32B | 15 Jul 2025 | 72.60% | Reasoning | Yes | Source |
| Gemini 2.5 Flash Preview TTS (2025-12-10) | 10 Dec 2025 | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Gemini Live 2.5 Flash Preview | 09 Apr 2025 | 63.90% | inferred high-confidence family alias from gemini-2.5-flash (score=0.5083; benches=14) | Yes | Source |
| Gemini 2.5 Flash Image (Nano Banana) | 02 Oct 2025 | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Gemini 2.5 Flash Preview (2025-09-25) | 25 Sep 2025 | 63.90% | inferred alias from gemini-2.5-flash | Yes | Source |
| Gemini 2.5 Flash Exp Native Audio Thinking Dialog | - | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Gemini 2.5 Flash Preview Native Audio Dialog | - | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Gemini 2.5 Flash Native Audio Preview (2025-09-23) | - | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Gemini 2.5 Flash Preview TTS (2025-05-20) | 20 May 2025 | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Gemini 2.5 Flash Image Preview (Nano Banana) | 25 Aug 2025 | 63.90% | inferred modality/version alias from gemini-2.5-flash | Yes | Source |
| Qwen 3 VL 235B A22B Instruct | - | 61.40% | - | Yes | - |
| EXAONE 4.0 1.2B | 15 Jul 2025 | 44.60% | Reasoning | Yes | Source |
| Mistral Large 3.0 | 02 Dec 2025 | 34.40% | No Reasoning | Yes | Source |
| Gemini 2.0 Flash Lite | 05 Feb 2025 | 28.90% | - | Yes | Source |