CharXiv-Reasoning - Benchmark Leaderboard & Model Performance | AI Stats

Organisation	Model	Date Reported	Top Score	Info	Self Reported	Source
Anthropic	Claude Opus 4.7	16 Apr 2026	91%	With tools	Yes	Source
Meta	Muse Spark	08 Apr 2026	86.40%	Figure understanding	Yes	Source
Google	Gemini 3.5 Flash	19 May 2026	84.20%	-	Yes	Source
Google	Gemini 3 Pro Preview	18 Nov 2025	81.40%	-	Yes	Source
OpenAI	GPT 5	07 Aug 2025	81.10%	With thinking, pass@1	Yes	Source
OpenAI	GPT 5 Mini	07 Aug 2025	75.50%	High Reasoning Effort	Yes	Source
Google	Gemini 3.1 Flash Lite Preview	03 Mar 2026	73.20%	-	Yes	Source
Google	Gemini 3.1 Flash-Lite	07 May 2026	73.20%	-	Yes	Source
OpenAI	GPT 5 Nano	07 Aug 2025	62.70%	High Reasoning Effort	Yes	Source
Cohere	Command A+	20 May 2026	52.70%	-	Yes	Source