Individual benchmark scores plotted by date.
| Model | Date Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Grok 3 | 18 Apr 2025 | 0.83 | - | Yes | Source |
| Grok 3 Mini | 18 Apr 2025 | 0.83 | High Reasoning Effort | Yes | Source |
| EXAONE 4.0 32B | 15 Jul 2025 | 0.82 | Reasoning | Yes | Source |
| Nova 2 Pro | 02 Dec 2025 | 0.82 | - | Yes | Source |
| Nova 2 Lite | 02 Dec 2025 | 0.81 | - | Yes | Source |
| Grok 3 Beta | 19 Feb 2025 | 0.80 | - | Yes | Source |
| Grok 3 Mini Beta | 19 Feb 2025 | 0.79 | - | Yes | Source |
| Mistral Small 3.2 | 20 Jun 2025 | 0.69 | 5 Shot CoT | Yes | Source |
| EXAONE 4.0 1.2B | 15 Jul 2025 | 0.59 | Reasoning | Yes | Source |