Open vs Closed AI

How Big Is the Gap on Each Benchmark?

Every text model we track, sorted from lowest to highest score on the benchmark you pick. Open-source models are shown in green, closed-source models in teal. See who is ahead and watch the gap close as new open weights ship.

Last updated: June 6, 2026

Open Source | Closed Source

Gap on GPQA: +3 pts (closed leads)

Top Open Source

1. Kimi K2.6: 91.1
2. DeepSeek-V4-Flash: 89.4
3. Qwen3.5-397B-A17B: 89.3

Top Closed Source

1. Gemini 3.1 Pro Preview: 94.1
2. GPT-5.5: 93.5
3. Qwen 3.7 Max: 92.3

China vs US Across All Benchmarks

The same models, plotted across every 0 to 100 benchmark we track. Each dot is a flag for the country the model was built in. Use it to spot where the gap is widest, where it has already closed, and which benchmarks the two ecosystems trade leadership on.

China | United States

What the Open vs Closed AI Gap Tracker Shows

The open vs closed AI gap is the score difference between the best open-weight AI model and the best closed-source flagship on a given benchmark. A small gap means open models have caught up; a large gap means a frontier-class capability is still only available behind a paid API. This tracker plots that gap on every text benchmark we cover, with models sorted from lowest to highest score so the distance between the top open and top closed dots is easy to see.
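The gap definition above can be sketched in a few lines of Python. This is only an illustration, not the tracker's actual code: the `ModelScore` type and the `open_closed_gap` function are hypothetical names, and the scores are the six GPQA leaders listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelScore:
    name: str
    score: float            # benchmark score on the 0 to 100 scale
    is_closed_source: bool  # mirrors the page's open/closed split


def open_closed_gap(models: list[ModelScore]) -> float:
    """Best closed score minus best open score, rounded to one decimal.

    A positive result means closed leads; a negative one means open leads.
    """
    best_open = max(m.score for m in models if not m.is_closed_source)
    best_closed = max(m.score for m in models if m.is_closed_source)
    return round(best_closed - best_open, 1)


# GPQA leaders as listed on this page.
gpqa = [
    ModelScore("Kimi K2.6", 91.1, False),
    ModelScore("DeepSeek-V4-Flash", 89.4, False),
    ModelScore("Qwen3.5-397B-A17B", 89.3, False),
    ModelScore("Gemini 3.1 Pro Preview", 94.1, True),
    ModelScore("GPT-5.5", 93.5, True),
    ModelScore("Qwen 3.7 Max", 92.3, True),
]

gap = open_closed_gap(gpqa)
print(f"Gap on GPQA: {gap:+.1f} pts")  # prints "Gap on GPQA: +3.0 pts"
```

Rounding to one decimal avoids floating-point noise in the subtraction, since the published scores carry one decimal place.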

Every text-generation model in the directory is plotted as a single dot, with open-source models in green and closed-source flagships in teal. The chart sorts ascending so you can read the catch-up curve at a glance and watch the gap close as new open weights ship. Pick any benchmark from the selector to swap the view: GPQA, MMLU-PRO, GSM8K, SWE-bench Verified, HLE, AIME, Terminal Bench, and more.

A separate country chart breaks the same data down by each lab's country of origin (US-headquartered vs China-headquartered) so you can see who is leading the open frontier and where the closed-source advantage is narrowest. Both charts can be exported as PNG with a watermark linking back to the page, so attribution stays intact when they are shared on social media.

Benchmarks covered: 15+
Models tracked: 150+
Score range: 0–100
Update cadence: On model release

See the Models Behind the Gap

Browse Every Tracked AI Model

The gap chart shows the score spread. The full model rankings show every open and closed model that contributed those dots, with hardware fit, VRAM required at Q4 quantization, and side-by-side comparison.

Frequently Asked Questions

What Is the Open vs Closed AI Gap Tracker?

It is a free dot-chart tool that plots every text-generation AI model we follow against a single benchmark, with open-source models colored green and closed-source flagships colored teal. The chart sorts ascending so the score gap between open weights and closed APIs is easy to read at a glance, on every benchmark we track.

How Is a Model Classified as Open or Closed Source?

Each model in our reference database carries an isClosedSource flag. Closed-source means the weights are not publicly distributed and the model is only accessible through an API or first-party product, like GPT-5, Claude Opus, or Gemini. Open-source means the weights are downloadable and runnable locally, like Llama, Qwen, Mistral, and DeepSeek releases.
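As a rough sketch of how a flag like this could drive the chart colors: only the `isClosedSource` field name comes from the description above. The rest of the record shape, the color constants, and the `dot_color` helper are assumptions made for illustration.

```python
OPEN_COLOR = "green"   # open-source dots, per the chart legend
CLOSED_COLOR = "teal"  # closed-source dots, per the chart legend


def dot_color(model: dict) -> str:
    """Pick a dot color from the record's isClosedSource flag."""
    return CLOSED_COLOR if model["isClosedSource"] else OPEN_COLOR


# Hypothetical records; only the isClosedSource key is named on this page.
models = [
    {"name": "GPT-5.5", "isClosedSource": True},
    {"name": "Kimi K2.6", "isClosedSource": False},
]

colors = {m["name"]: dot_color(m) for m in models}
print(colors)  # {'GPT-5.5': 'teal', 'Kimi K2.6': 'green'}
```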

Where Do the Benchmark Scores Come From?

Scores come from the official leaderboards or papers for each benchmark. LM Arena is synced automatically. Academic and research benchmarks like GPQA, MMLU-PRO, GSM8K, SWE-bench, HLE, AIME, HMMT, Terminal Bench, EvasionBench, and olmOCR are entered into our reference model database when vendors or labs publish a number, then refreshed when scores update. Click a benchmark in the explainer to open the source dataset.

Why Are Some Benchmarks Missing Models?

A model only appears on a benchmark if the lab or vendor has published a score for it. Frontier labs report different subsets, and many open-source teams skip the niche benchmarks. The chart hides any benchmark with fewer than four scoring models so the visualization stays meaningful instead of showing two lonely dots.
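The four-model cutoff could look something like the sketch below. The data layout (benchmark name mapped to per-model scores, with `None` for an unpublished score) is an assumption, and the benchmark and model names are made-up placeholders rather than real results.

```python
from typing import Optional

MIN_SCORING_MODELS = 4  # cutoff stated in the answer above


def visible_benchmarks(
    scores: dict[str, dict[str, Optional[float]]]
) -> list[str]:
    """Return benchmarks with at least MIN_SCORING_MODELS published scores."""
    visible = []
    for benchmark, by_model in scores.items():
        # None marks a model whose lab has not published a score.
        published = [s for s in by_model.values() if s is not None]
        if len(published) >= MIN_SCORING_MODELS:
            visible.append(benchmark)
    return visible


# Placeholder data for illustration only.
example = {
    "bench-x": {"model-a": 70.0, "model-b": 64.5, "model-c": 81.2, "model-d": 77.7},
    "bench-y": {"model-a": 55.0, "model-b": None, "model-c": 61.3, "model-d": None},
}

shown = visible_benchmarks(example)
print(shown)  # ['bench-x']
```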

How Often Does the Data Update?

LM Arena scores sync automatically on a recurring schedule, so the LM Arena chart reflects the public leaderboard within hours of a refresh. Other benchmarks update whenever a vendor or research group publishes a new result and our team enters it into the reference database. The Last Updated label in the hero shows the latest sync timestamp across all text benchmarks.
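A label like that is simply the maximum over the per-benchmark sync times. A minimal sketch, assuming each benchmark stores a timezone-aware timestamp; the `last_updated` helper, the benchmark names, and the timestamps are hypothetical.

```python
from datetime import datetime, timezone


def last_updated(sync_times: dict[str, datetime]) -> datetime:
    """Return the most recent sync timestamp across all benchmarks."""
    return max(sync_times.values())


# Hypothetical per-benchmark sync times (UTC).
sync_times = {
    "bench-x": datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc),
    "bench-y": datetime(2026, 6, 6, 8, 30, tzinfo=timezone.utc),
    "bench-z": datetime(2026, 5, 28, 17, 45, tzinfo=timezone.utc),
}

latest = last_updated(sync_times)
# Build the label by hand so the day has no leading zero.
label = f"{latest:%B} {latest.day}, {latest.year}"
print(f"Last updated: {label}")
```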

Can I Share or Download the Charts?

Yes. Use the Share button in the hero to copy the page URL and post it anywhere. To save a specific benchmark as a still image, use the Download as Image button next to the chart. The exported PNG includes the benchmark title, the open and closed source legend, the dot chart, and a watermark linking back to the page so attribution stays intact.

Why Does the Chart Sort Ascending?

Sorting from lowest to highest score makes the gap between open-source and closed-source models pop out visually. The eye reads the chart left to right and watches green dots climb. The longer the closed-source teal cluster sits at the top of the chart with no green nearby, the bigger the gap on that benchmark, and the more interesting the next open release becomes.
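To make the "teal cluster at the top" idea concrete, the sketch below sorts the six GPQA leaders listed on this page in ascending order and counts how many closed models sit above the best open one. The tuple layout and both function names are assumptions for illustration, not the chart's real implementation.

```python
# Each entry: (name, score, is_closed_source). Scores are the GPQA
# leaders listed on this page.
gpqa = [
    ("Kimi K2.6", 91.1, False),
    ("DeepSeek-V4-Flash", 89.4, False),
    ("Qwen3.5-397B-A17B", 89.3, False),
    ("Gemini 3.1 Pro Preview", 94.1, True),
    ("GPT-5.5", 93.5, True),
    ("Qwen 3.7 Max", 92.3, True),
]


def ascending(models):
    """Sort models from lowest to highest score, as the chart does."""
    return sorted(models, key=lambda m: m[1])


def closed_above_best_open(models):
    """Count closed models scoring higher than the best open model."""
    best_open = max(score for _, score, closed in models if not closed)
    return sum(1 for _, score, closed in models if closed and score > best_open)


for name, score, closed in ascending(gpqa):
    color = "teal" if closed else "green"
    print(f"{score:5.1f}  {color:5}  {name}")

cluster = closed_above_best_open(gpqa)
print(cluster)  # 3
```

On this data all three closed leaders land above the top open model, which is the pattern the ascending sort is meant to expose.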

Need Help Picking a Model?

We help teams ship the right open or closed model for the job, and the hardware to run it on.

Related AI Model Tools

AI model treemap

AI benchmarks library

Live AI API price tracker

AI hardware calculator
