
OpenAI just dropped its most restricted model launch yet, and the naming scheme finally feels straightforward. GPT-5.6 introduces Sol, Terra, and Luna: three clear capability tiers where the number marks the generation and the name tells you what you’re buying. Here’s what each variant actually does well, and how to pick the right one for your workloads.
Sol is the top tier: frontier reasoning, long-horizon agentic work, and advanced use cases in coding, science, and cybersecurity. Terra is the middle tier, aiming for GPT-5.5-level performance at about half the cost. Luna is built for high-volume jobs where speed and price matter more than peak capability.
The big win here is clarity. The tier names replace confusing suffixes like "Instant" or "Pro" from earlier releases. And they’re meant to stick: future generations should bump the number while keeping the same tier names.
| Model | Best For | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|
| Sol | Frontier reasoning, agentic work, cybersecurity, biology | $5.00 | $30.00 |
| Terra | Everyday enterprise tasks, knowledge work | $2.50 | $15.00 |
| Luna | High-volume classification, extraction, routing | $1.00 | $6.00 |
Sol matches GPT-5.5 pricing while delivering what OpenAI describes as a "step-function improvement in capability." Terra comes in at 50% of Sol’s cost, and Luna drops to 20%. The key insight: you can route work by need and keep spend under control without guessing.

Sol introduces two operating modes that point to a real shift in how frontier models are expected to behave. "Max" reasoning effort pushes a single model harder. "Ultra" goes a step further by spinning up subagents and coordinating them to complete complex tasks.
This isn’t really about nicer chat replies anymore. The focus is planning, iteration, tool use, and agent coordination. Vendor-reported Terminal-Bench 2.1 scores reflect that: Sol Ultra hits 91.9%, and standard Sol reaches 88.8%. These results still need independent replication, but they suggest meaningful gains for terminal-based coding and agent-style workflows.
[!IMPORTANT] Sol can be more persistent than predecessors and may exceed user intent in coding-agent contexts. Implement human approval gates, logging, sandboxing, and cost controls for any agentic deployment.
On cybersecurity, the capabilities are strong enough that OpenAI is restricting access. External evaluator Irregular reported Sol solved 19 of 197 FrontierCyber challenges, with solve rates of 11% on Easy, 12% on Medium, 5% on Hard, and 0% on Elite. GPT-5.5 scored 6%, 6%, 4%, and 0% on the same tiers. Sol can spot vulnerabilities and exploitation primitives, although it didn’t autonomously produce functional full-chain exploits against hardened browser-style targets in testing.
Terra is aimed at the broadest set of use cases: everyday enterprise knowledge work. It’s positioned as GPT-5.5-competitive, which typically means most teams won’t notice much of a drop for standard tasks like document analysis, email drafting, meeting summaries, and general Q&A.
That 50% cost reduction versus Sol is what makes Terra the default for many orgs. For example, a company processing 10 million input tokens and 10 million output tokens daily would save roughly $25 on input costs and $150 on output costs: $175 per day, or over $5,000 per month.
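To check that arithmetic against your own volumes, here is a minimal sketch using the prices from the table above. The `daily_cost` helper and the lowercase tier keys are illustrative names, not anything from OpenAI's API, and the example assumes 10 million input tokens and 10 million output tokens per day.

```python
# Per-million-token prices in USD, copied from the pricing table above.
PRICES = {
    "sol": {"input": 5.00, "output": 30.00},
    "terra": {"input": 2.50, "output": 15.00},
    "luna": {"input": 1.00, "output": 6.00},
}


def daily_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return daily spend in USD for one tier at the given token volumes."""
    price = PRICES[tier]
    return (
        (input_tokens / 1_000_000) * price["input"]
        + (output_tokens / 1_000_000) * price["output"]
    )


# Assumption: 10M input tokens AND 10M output tokens per day.
volume = 10_000_000
sol_cost = daily_cost("sol", volume, volume)      # 50 + 300
terra_cost = daily_cost("terra", volume, volume)  # 25 + 150
daily_savings = sol_cost - terra_cost
monthly_savings = daily_savings * 30  # assuming a 30-day month

print(f"Sol: ${sol_cost:.2f}/day, Terra: ${terra_cost:.2f}/day")
print(f"Savings: ${daily_savings:.2f}/day, ${monthly_savings:.2f}/month")
```

Swap in your real input and output volumes separately. Most workloads are not symmetric, and output tokens dominate the bill at these prices.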
[!TIP] Start with Terra for new projects. Upgrade specific workflows to Sol only when you hit capability limits. That keeps costs down while showing exactly where frontier reasoning is actually paying off.
Terra can still handle multi-step reasoning, code generation, and analysis competently. Where Sol tends to pull ahead is in longer agentic workflows, tougher debugging sessions, and specialized domains like advanced biology or cybersecurity research.
Luna is for workloads where latency and cost beat depth. Tasks like classification, extraction, summarization, and routing rarely need frontier reasoning. They need fast, cheap, consistent output at scale.
At $1 input and $6 output per million tokens, Luna opens up use cases that were often too expensive before. High-volume document processing, real-time content moderation, and API-first applications are usually the best fit.
| Use Case | Recommended Tier | Why |
|---|---|---|
| Complex codebase debugging | Sol | Requires deep reasoning and multi-step analysis |
| Customer support automation | Terra | Balances quality with cost for conversational tasks |
| Log analysis and alerting | Luna | High volume, pattern matching, speed matters |
| Scientific literature review | Sol | Benefits from frontier reasoning in specialized domains |
| Email triage and routing | Luna | Simple classification at scale |
| Contract analysis | Terra | Needs good comprehension without frontier costs |
The right tier won’t always be obvious upfront. A common pattern is to run Luna for initial classification, then send the harder cases to Terra or Sol. That cascade tends to keep costs low without sacrificing quality where it matters.
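A rough sketch of that cascade, under some loud assumptions: `call_model` is a placeholder you would replace with your own client code, the `confidence` score is something your application has to produce itself (for example, from a validator or a self-check prompt), and the 0.8 threshold is arbitrary. None of these names come from OpenAI.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    tier: str
    answer: str
    confidence: float  # 0.0 to 1.0, produced by your own scoring logic


# Escalation order: cheapest tier first.
CASCADE = ["luna", "terra", "sol"]


def run_cascade(
    prompt: str,
    call_model: Callable[[str, str], Result],
    threshold: float = 0.8,
) -> Result:
    """Try each tier in order and stop at the first confident answer."""
    result: Optional[Result] = None
    for tier in CASCADE:
        result = call_model(tier, prompt)
        if result.confidence >= threshold:
            return result
    # Nothing cleared the threshold: fall back to the top tier's answer.
    assert result is not None
    return result


def fake_call_model(tier: str, prompt: str) -> Result:
    """Stand-in for a real API call: pretends longer prompts are harder."""
    difficulty = min(len(prompt) / 100, 1.0)
    strength = {"luna": 0.3, "terra": 0.6, "sol": 1.0}[tier]
    confidence = 1.0 if strength >= difficulty else 0.5
    return Result(tier=tier, answer=f"[{tier}] answer", confidence=confidence)


for length in (5, 50, 90):
    chosen = run_cascade("x" * length, fake_call_model)
    print(length, "->", chosen.tier)
```

The trade-off is that escalated requests pay for every tier they pass through, so the cascade only pays off when most traffic resolves at the cheaper tiers.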
SecureBio evaluations show noticeable gains in life sciences. Sol scores 68.3% on World-Class Bio benchmarks, about 9 points above GPT-5.5’s 59.7%, and the per-domain results are reported as consistently higher as well.
That helps explain the restricted rollout. The GPT-5.6 System Card labels all three models as "High" capability in Biological/Chemical risk under OpenAI's Preparedness Framework, though none hit the "Critical" threshold that would trigger even tighter deployment limits.
If your team works in biology, chemistry, or adjacent research areas, Sol is a real capability jump. In practice, that usually means stronger literature synthesis, better hypothesis generation, and more useful help with experimental design. Teams considering local LLM alternatives for sensitive work should weigh these benchmark gains against privacy and data-handling requirements.

This is OpenAI’s most restrictive launch so far. The preview is limited to roughly 20 partner organizations, and access is API and Codex only. GPT-5.6 isn’t in ChatGPT during the preview. There’s also no public application or waitlist.
The June 2026 executive order on frontier AI sets up a voluntary framework for frontier models, including classified cyber benchmarking and up to 30-day government access before public release. OpenAI’s rollout hints at what’s next for top-tier models: capability tiers, heavier safety review, and more coordination with government processes.
[!NOTE] General availability for all three tiers is expected "in the coming weeks" according to OpenAI, though no specific date has been announced as of July 4, 2026.
The safety stack here is OpenAI’s most extensive yet. It includes model-level safety training, real-time cyber and biology misuse classifiers, account-level monitoring, differentiated access controls, and ongoing red-teaming. OpenAI reports over 700,000 A100-equivalent GPU hours spent on automated red-teaming for universal jailbreaks alone.
Deployment simulations project 8.6 harassment-policy violations per 100,000 conversation turns for Sol. Sexual disallowed content ticked up slightly from 0.05% to 0.07% versus GPT-5.5, while mental-health disallowed responses dropped from 0.03% to 0.02%.
The three-tier system makes cost control much more practical. Instead of defaulting to the strongest model, teams can route requests based on what the task actually needs.
Sol workloads: Complex multi-step reasoning, codebase debugging across large repositories, cybersecurity defense and vulnerability analysis, scientific research requiring domain expertise, agentic workflows with tool use and iteration.
Terra workloads: General enterprise knowledge work, document analysis and synthesis, code generation for standard tasks, customer-facing applications requiring quality responses, internal tools and assistants.
Luna workloads: High-volume classification and tagging, content extraction and summarization, request routing and triage, real-time moderation, any task where throughput matters more than depth.
[!WARNING] Don’t route cybersecurity or biology-related queries to Luna or Terra. These areas benefit a lot from Sol’s specialized training, and a capability gap can lead to incomplete or misleading outputs in high-stakes contexts.
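One way to encode the routing guidance above is a static lookup with a domain override. This is only a sketch: the task-type labels and the `pick_tier` function are made up for illustration, and the fallback to Terra simply mirrors the "start with Terra" advice earlier in this post.

```python
from typing import Optional

# Hypothetical task-type labels mapped to tiers, following the lists above.
ROUTES = {
    "classification": "luna",
    "extraction": "luna",
    "request_routing": "luna",
    "moderation": "luna",
    "document_analysis": "terra",
    "customer_support": "terra",
    "code_generation": "terra",
    "codebase_debugging": "sol",
    "agentic_workflow": "sol",
    "scientific_research": "sol",
}

# Domains that should always go to Sol, regardless of task type.
SOL_ONLY_DOMAINS = {"cybersecurity", "biology"}


def pick_tier(task_type: str, domain: Optional[str] = None) -> str:
    """Return the tier for a request; high-stakes domains override the table."""
    if domain in SOL_ONLY_DOMAINS:
        return "sol"
    # Unknown task types fall back to Terra as the general-purpose default.
    return ROUTES.get(task_type, "terra")


print(pick_tier("classification"))                   # luna
print(pick_tier("classification", domain="biology"))  # sol
print(pick_tier("something_new"))                    # terra
```

Checking the domain before the task type means a cheap-looking job such as classification still lands on Sol when the content is cybersecurity or biology.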
What’s often missed: cascading can be the sweet spot. Luna handles the first pass, Terra covers the “normal” cases, and Sol takes the hard edge cases. It’s more work architecturally, but many teams can cut costs by 60-80% versus sending everything to Sol.
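To sanity-check that range for your own traffic, a back-of-the-envelope estimate helps. The sketch below assumes every request hits Luna first, escalated requests also pay for each tier they pass through, and each call uses about the same number of tokens at every tier. The traffic mixes are invented examples, not measured data.

```python
# Cost of each tier relative to Sol, from the published prices (Sol = 1.0).
RELATIVE_COST = {"sol": 1.0, "terra": 0.5, "luna": 0.2}


def cascade_savings(resolved_at: dict) -> float:
    """Estimate fractional savings versus sending everything to Sol.

    resolved_at maps each tier to the share of requests that finish there;
    the shares must sum to 1.
    """
    if abs(sum(resolved_at.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    reach_luna = 1.0  # every request gets a Luna first pass
    reach_terra = resolved_at["terra"] + resolved_at["sol"]
    reach_sol = resolved_at["sol"]
    blended = (
        RELATIVE_COST["luna"] * reach_luna
        + RELATIVE_COST["terra"] * reach_terra
        + RELATIVE_COST["sol"] * reach_sol
    )
    return 1.0 - blended


# Hypothetical mixes: share of requests resolved at each tier.
for mix in (
    {"luna": 0.70, "terra": 0.25, "sol": 0.05},
    {"luna": 0.85, "terra": 0.12, "sol": 0.03},
):
    print(mix, f"-> about {cascade_savings(mix):.0%} saved")
```

Under these assumptions the ceiling is 80% (everything resolves at Luna), and the savings shrink quickly as more traffic escalates, so it is worth measuring your real escalation rates before committing to the extra architecture.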

If your organization expects access, it’s smart to get ready now. The preview won’t last forever, and GA will probably come with a rush of demand.
API integration: Review the OpenAI Help Center documentation for model identifiers and API parameters. The new reasoning effort levels ("max" and "ultra") need explicit configuration.
Cost modeling: Build projections from the published pricing. Map which workloads belong on Sol, Terra, or Luna. The 5x price gap between Sol and Luna, on both input and output tokens, adds up fast at scale.
Safety infrastructure: If you’re building agentic workflows, put the controls in place that OpenAI calls out: human approval gates for consequential actions, detailed logging, sandboxed execution, and spending limits.
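As a starting point, three of those controls (approval gates, logging, and a spending limit) can be wrapped around whatever executes agent actions. The sketch below is illustrative only: `GuardedExecutor`, `BudgetExceeded`, and the cost estimates are hypothetical names and inputs, and sandboxed execution is out of scope here because it depends on your runtime.

```python
import logging
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-guard")


class BudgetExceeded(RuntimeError):
    """Raised when an action would push spend past the configured limit."""


class GuardedExecutor:
    def __init__(self, budget_usd: float, approve: Callable[[str], bool]):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self.approve = approve  # e.g. a function that asks a human reviewer

    def run(
        self,
        description: str,
        action: Callable[[], str],
        est_cost_usd: float,
        consequential: bool,
    ) -> Optional[str]:
        """Run an action if it fits the budget and, when needed, is approved."""
        if self.spent_usd + est_cost_usd > self.budget_usd:
            log.warning("budget exceeded, blocked: %s", description)
            raise BudgetExceeded(description)
        if consequential and not self.approve(description):
            log.info("denied by reviewer: %s", description)
            return None
        self.spent_usd += est_cost_usd
        log.info("running: %s (spent so far: $%.2f)", description, self.spent_usd)
        return action()


if __name__ == "__main__":
    # Demo approver that rejects everything; swap in a real review step.
    executor = GuardedExecutor(budget_usd=1.00, approve=lambda desc: False)
    print(executor.run("read config file", lambda: "ok", 0.40, consequential=False))
    print(executor.run("force-push to main", lambda: "done", 0.40, consequential=True))
```

Checking the budget before asking for approval keeps reviewers from being prompted about actions that would be blocked anyway.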
Policy review: This rollout is a signal that frontier deployments are getting more scrutiny. Regulated teams should line up compliance requirements before moving GPT-5.6 into production.
Start here (your first step)
Review your current OpenAI API usage and categorize each workflow as Sol-appropriate, Terra-appropriate, or Luna-appropriate using the routing guidance above.
GPT-5.6’s three-tier system makes model selection much easier. Sol is the choice for demanding, high-stakes work that benefits from frontier reasoning. Terra is the practical default for most enterprise tasks at about half the cost. Luna is the budget-friendly option for high-volume processing where speed matters most.
The restricted rollout also points to where frontier AI releases are heading: tiered by capability, gated by safety review, and increasingly coordinated with government frameworks. Teams that get their routing strategy and supervision controls ready now will be in a strong position when general availability lands.