GPT-5.6 Review — Sol, Terra, and Luna in Preview
On June 26, 2026, OpenAI previewed GPT-5.6 — not as a single model but as a three-tier family: Sol, Terra, and Luna. Access at preview is limited, routed through Codex and the API for trusted partners, with general availability promised “in the coming weeks.” OpenAI’s chief scientist framed it as a “meaningful leap” over GPT-5.5, but the preview shipped with confirmed pricing and tier structure and, notably, no scores on the standard public benchmark suite. This review separates what is concrete from what is still framing.
TL;DR verdict
| GPT-5.6 | |
|---|---|
| Type | Frontier model family (three tiers) |
| Tiers | Sol (flagship), Terra (balanced), Luna (high-volume) — plus Sol Ultra effort mode |
| Context window | Unconfirmed at preview (GPT-5.5 shipped 1M; we use 1M for Sol) |
| Pricing | Sol $5/$30 · Terra $2.50/$15 · Luna $1/$6 (input/output, per 1M tokens) |
| Published number | Terminal-Bench 2.1 91.9 (Sol Ultra) — outside the standard suite |
| Availability | Limited preview via Codex + API; GA pending |
| Best for | Teams already on GPT-5.5 planning a tiered migration |
| Caveat | No standard-suite benchmarks at preview; leaderboard scores are estimates anchored to GPT-5.5, not measurements |
If you skip the rest: GPT-5.6’s pricing and tiering are concrete and competitive, but its capability story is still a preview promise. The tier split is the real news — OpenAI is matching GPT-5.5’s headline rate on the Sol flagship while opening cheaper Terra and Luna tiers below it. The honest asterisk is large: at preview OpenAI published only a Terminal-Bench result, so every standard-suite placement is interpolation until general availability.
What it is
GPT-5.6 is structured as three named tiers rather than one model with size suffixes:
- Sol — the frontier reasoning and long-horizon agentic flagship, aimed at coding, biology, and cybersecurity workloads. A compute-intensive high-effort variant, Sol Ultra, sits above it.
- Terra — a balanced everyday tier OpenAI positions as competitive with GPT-5.5 at roughly half the cost.
- Luna — the fastest and cheapest tier, built for high-volume routine tasks.
The family also introduces new reasoning modes, continuing the effort-tier pattern OpenAI established with the o-series and carried into GPT-5.5. The context window is not officially confirmed at preview; GPT-5.5 shipped a 1M-token window, and GPT-5.6 is expected to match it, with one unofficial report citing 1.5M. We record 1M for the Sol tier in the leaderboard until OpenAI publishes final specifications — the conservative choice when a number is unconfirmed.
What the preview actually published
This is where GPT-5.6 differs from a typical frontier launch: it arrived without the usual benchmark deck. OpenAI cited a single headline number, Terminal-Bench 2.1, where Sol Ultra leads at 91.9, and that benchmark is outside the eight we track on the leaderboard. At preview there are no GPT-5.6 scores for any tracked benchmark, including SWE-bench, GPQA Diamond, MMLU-Pro, MATH-500, Aider Polyglot, and tau-bench.
For reference, the confirmed GPT-5.5 predecessor posted SWE-bench Pro 58.6, GPQA Diamond 93.6, and Terminal-Bench 2.0 82.7. The jump from 82.7 on Terminal-Bench 2.0 to Sol Ultra’s 91.9 on the 2.1 revision is the one measured signal of progress, and it is a real one on agentic, terminal-driven coding — but it is a single benchmark, on the top-effort variant, on a revised test. That is not enough to rank GPT-5.6 against the field on its own.
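To put that one signal in numbers, here is a minimal arithmetic sketch. It assumes both scores are percentages out of 100 and, more loosely, that Terminal-Bench 2.0 and 2.1 are comparable enough to subtract. The second assumption is exactly what the revised test calls into question, so treat the result as a rough indication rather than a measurement.

```python
# Scores quoted in the text above.
gpt55_tb20 = 82.7      # GPT-5.5 on Terminal-Bench 2.0
sol_ultra_tb21 = 91.9  # GPT-5.6 Sol Ultra on Terminal-Bench 2.1

# Assumption: both scores are pass rates on a 0-100 scale and the two
# benchmark revisions are roughly comparable.
point_gain = round(sol_ultra_tb21 - gpt55_tb20, 1)

fail_before = 100 - gpt55_tb20
fail_after = 100 - sol_ultra_tb21
relative_failure_drop = round((1 - fail_after / fail_before) * 100, 1)

print(f"Absolute gain: {point_gain} points")
print(f"Relative drop in failures: {relative_failure_drop}%")
```

Under those assumptions the gain is 9.2 points, and the share of failed tasks falls by roughly half. It still describes one benchmark on the top-effort variant.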
Because of that, the GPT-5.6 Sol row in our models leaderboard carries conservative estimates anchored to GPT-5.5. Each standard-suite cell is nudged up one notch, consistent with OpenAI’s “meaningful leap” framing and the Terminal-Bench gain, and is deliberately kept below the GPT-5.5 Pro heavy-compute tier so the estimate does not overstate a preview. Those cells are flagged as estimates and will be replaced the moment OpenAI or a third party publishes real numbers.
What it costs
OpenAI confirmed per-tier API pricing at preview, per 1M tokens:
| Tier | Input | Output | Positioning |
|---|---|---|---|
| Sol | $5.00 | $30.00 | Frontier flagship — matches GPT-5.5’s headline rate |
| Terra | $2.50 | $15.00 | Balanced everyday tier, ~half of Sol |
| Luna | $1.00 | $6.00 | High-volume, lowest cost |
The structure matters more than any single number. By holding Sol at GPT-5.5’s $5/$30 and adding Terra and Luna below it, OpenAI is making the move the rest of the field has converged on: a flagship plus cheaper tiers under one family, rather than a new generation that simply raises the price. For most production traffic that does not need the frontier tier, Terra and Luna are the lines to watch. The leaderboard cost calculator lets you plug in your own token mix against GPT-5.5 and the open tier to see where the tier split actually pays off.
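As a rough illustration of that calculation, the sketch below prices one token mix on each tier using the rates from the pricing table above. The function name `monthly_cost` and the workload figures (800M input and 200M output tokens per month) are hypothetical, chosen only to make the arithmetic easy to follow. It is not the leaderboard calculator itself.

```python
# Per-1M-token prices in USD (input, output), taken from the pricing table.
PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}


def monthly_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a token mix on a single tier."""
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 800M input tokens and 200M output tokens per month.
for tier in PRICES:
    cost = monthly_cost(tier, 800_000_000, 200_000_000)
    print(f"{tier}: ${cost:,.2f}")
```

For this mix the tiers come out at $10,000, $5,000, and $2,000 respectively. Because Sol matches GPT-5.5's headline rate, the Sol figure doubles as the GPT-5.5 baseline.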
How it compares
GPT-5.6’s real competition at the top is Claude Opus 4.8 and the Mythos-class Claude Fable 5, both of which already have published benchmark cards — an advantage GPT-5.6 lacks at preview. On the open side, GLM-5.2 and the open coding tier keep undercutting closed pricing six-to-one on agentic coding, which is exactly the pressure the cheaper Luna tier is built to answer. Against its own predecessor, GPT-5.5, the Sol flagship is a same-price, hopefully-better swap, while Terra is the more interesting line for teams that found GPT-5.5 capable but expensive.
The pattern worth noting: GPT-5.6 lands in a crowded late-June window, alongside a wave of Chinese frontier refreshes and just ahead of Google’s delayed Gemini 3.5 Pro (now slated for July). OpenAI shipping a preview with pricing but without a full benchmark deck reads as a move to plant a flag in that window: useful to plan around, but not yet a model you can rank with confidence.
Who should care
- Teams already on GPT-5.5: This is the headline audience. The tier structure (Sol/Terra/Luna) and confirmed pricing let you plan a migration now, but wait for general availability and published evals before moving a production workload — and A/B the specific tier you would use.
- Cost-sensitive, high-volume pipelines: Watch Luna ($1/$6) and Terra ($2.50/$15). If GPT-5.5’s headline rate was the blocker, the cheaper tiers may change the math — confirm capability on your task once GA opens. See multi-agent pipelines for where a cheap tier slots into a larger workflow.
- Agentic and terminal-driven coding teams: The one published signal — Sol Ultra leading Terminal-Bench 2.1 — is directly relevant. If your work is long-horizon, tool-using coding, GPT-5.6 Sol is worth a preview slot if you can get one.
- Anyone ranking models today: Hold off. With no standard-suite numbers at preview, GPT-5.6’s leaderboard position is an estimate. The LLM Benchmark Comparison 2026 covers how to weigh a model that ships pricing before proof.
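For teams weighing a tiered migration, a blended-cost estimate is one way to see how much the cheaper tiers change the math. The sketch below is a planning aid under stated assumptions: the function `blended_cost`, the traffic shares, and the token volumes are all hypothetical, and it assumes every tier sees the same input-to-output ratio. Only the per-tier prices come from the review.

```python
# Per-1M-token prices in USD (input, output), as confirmed at preview.
PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}


def blended_cost(shares: dict[str, float], input_tokens: int, output_tokens: int) -> float:
    """Cost in USD when traffic is split across tiers by the given shares.

    Assumption: each tier's slice of traffic has the same input/output mix.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    total = 0.0
    for tier, share in shares.items():
        in_price, out_price = PRICES[tier]
        tier_cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        total += share * tier_cost
    return total


# Hypothetical monthly volume: 100M input tokens, 20M output tokens.
baseline = blended_cost({"Sol": 1.0}, 100_000_000, 20_000_000)
tiered = blended_cost({"Sol": 0.1, "Terra": 0.3, "Luna": 0.6}, 100_000_000, 20_000_000)
savings_pct = (1 - tiered / baseline) * 100

print(f"All Sol: ${baseline:,.2f}")
print(f"Tiered:  ${tiered:,.2f} ({savings_pct:.0f}% lower)")
```

With this illustrative split the bill drops from $1,100 to about $407. That saving only holds if Terra and Luna prove capable enough on your task, which is what the post-GA A/B test should confirm.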
FAQ
What is GPT-5.6? OpenAI’s next-generation family, previewed June 26, 2026, in three tiers — Sol (flagship), Terra (balanced), and Luna (high-volume) — plus a Sol Ultra high-effort variant. Access at preview is limited via Codex and the API, with general availability pending.
How much does it cost? Per 1M tokens: Sol $5 input / $30 output, Terra $2.50 / $15, Luna $1 / $6. Sol matches GPT-5.5’s headline rate; Terra and Luna sit below it.
Does it have published benchmarks? Not on the standard suite at preview. OpenAI cited only Terminal-Bench 2.1 (Sol Ultra 91.9). Our leaderboard cells for GPT-5.6 Sol are estimates anchored to GPT-5.5 until real numbers land.
What’s the context window? Unconfirmed at preview. GPT-5.5 shipped 1M tokens and GPT-5.6 is expected to match; we record 1M for Sol pending official specs.
Should I switch from GPT-5.5 now? Not for production. It is a limited preview with no published standard benchmarks. Plan around the confirmed pricing and tiers, but wait for GA and independent evals before migrating.
Continue reading
- AI Models Leaderboard — GPT-5.6 Sol versus 60+ models on benchmarks, pricing, and context window, with a cost calculator.
- GLM-5.2 Review — the open coding flagship whose six-to-one pricing is the pressure GPT-5.6’s Luna tier answers.
- Claude Opus 4.8 Review — the closed frontier competitor that, unlike GPT-5.6, shipped with a full benchmark card.
- LLM Benchmark Comparison 2026 — how to read self-reported numbers, and how to weigh a model that ships pricing before proof.
- All Reviews — index of every head-to-head review on the site.