Capabilities, modalities, and lifecycle fields pulled from the model database.
Comparative benchmark results reported for the selected models.
| Benchmark | GPT OSS 20b |
|---|---|
| AIME 2024 | 42.1% |
| AIME 2025 | 37.1% |
| Aider-Polyglot | 16.6% |
| Codeforces | 1251 (Elo) |
| EQ-Bench 3 | 800.2 (Elo) |
| GPQA Diamond | 56.8% |
| HealthBench | 40.4% |
| HealthBench Consensus | 82.6% |
| HealthBench Hard | 9.0% |
| Humanity's Last Exam | 4.2% |
| MMLU | 80.4% |
| MMMLU | 67.0% |
| SWE-Bench | 37.4% |
| Tau Bench (Airline) | 32.0% |
| Tau Bench (Retail) | 35.0% |
Observed provider pricing per million tokens.
All unique meters observed across the selected models.
| Meter | GPT OSS 20b |
|---|---|
| Input Text Tokens | $0.04 |
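The per-million-token rates above (and the $0.15 output rate listed under Operational Metrics below) make request cost a simple linear calculation. A minimal sketch, assuming straightforward per-token billing with no minimums or tiered discounts:

```python
# Estimate the USD cost of a single request at the observed
# GPT OSS 20b rates: $0.04 per 1M input tokens, $0.15 per 1M output tokens.

INPUT_PRICE_PER_M = 0.04   # USD per 1M input text tokens
OUTPUT_PRICE_PER_M = 0.15  # USD per 1M output text tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 4,000-token prompt with a 1,000-token completion.
print(f"${request_cost(4_000, 1_000):.6f}")  # → $0.000310
```

At these rates, even a full 131,072-token context costs well under a cent of input per request.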
Providers that expose each model based on observed pricing data.
Maximum input and output token capacity.
Usage and distribution terms.
Model release chronology.
Most recent training data date (when available).
A deeper field-by-field view (including benchmarks, pricing, and links).
| Field | GPT OSS 20b |
|---|---|
| General Information | |
| Context Window | Input: 131,072 Output: 131,072 |
| Modalities | In: Text Out: Text |
| Reasoning | - |
| Web access | - |
| Parameters | - |
| Training Tokens | - |
| License | Apache 2.0 |
| Knowledge Cutoff | Jun 2024 |
| Status | Available |
| Release | Aug 2025 |
| Announced | Aug 2025 |
| Deprecation | - |
| Retirement | - |
| Links | |
| Operational Metrics | |
| Cost per 1M Tokens | Input: $0.04 Output: $0.15 |
| Latency | - |
| Throughput | - |
| Benchmarks | |
| AIME 2024 | 42.1% |
| AIME 2025 | 37.1% |
| Aider-Polyglot | 16.6% |
| Codeforces | 1251 (Elo) |
| EQ-Bench 3 | 800.2 (Elo) |
| GPQA Diamond | 56.8% |
| HealthBench | 40.4% |
| HealthBench Consensus | 82.6% |
| HealthBench Hard | 9.0% |
| Humanity's Last Exam | 4.2% |
| MMLU | 80.4% |
| MMMLU | 67.0% |
| SWE-Bench | 37.4% |
| Tau Bench (Airline) | 32.0% |
| Tau Bench (Retail) | 35.0% |