Individual benchmark scores plotted by date.
| Organisation | Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|---|
| | Claude Opus 4.6 | 05 Feb 2026 | 91.30 | LLM Stats (ZeroEval) | Yes | Source |
| | MiMo V2 TTS | 18 Mar 2026 | 86.70 | LLM Stats (ZeroEval); inferred modality/version alias from mimo-v2-pro | Yes | Source |
| | MiMo V2 Pro | 18 Mar 2026 | 86.70 | LLM Stats (ZeroEval) | Yes | Source |
| | Seed 2.0 Pro | 14 Feb 2026 | 77.40 | Seed2 official benchmark table; DeepSearchQA | Yes | Source |
| | Kimi K2.5 | 27 Jan 2026 | 77.10 | LLM Stats (ZeroEval) | Yes | Source |
| | Seed 2.0 Lite | 14 Feb 2026 | 67.70 | Seed2 official benchmark table; DeepSearchQA | Yes | Source |