Frontier No More?

Five compounding forces are challenging the centrality of frontier models. Are we at the end of the "model dominance" era?

Vivek Ramaswami, Sabrina Albert, and Caleb Bushner

Jun 18, 2026

For the past two years, the outcome of the AI race felt close to settled. A few labs owned the best models, the gap to everyone else looked like it was only widening, and the rest of the industry organized around a single anxious question: “what happens when OpenAI or Anthropic builds this?”

But in the past few weeks, the tide has been turning.

The largest labs no longer hold the lock they seemed to have on where AI's value would accrue, and the opening is visible on multiple simultaneous fronts.

Open-weight models have closed enough of the capability gap that raw model quality is no longer the deciding factor for a wide range of work. The economics that were supposed to funnel ever more spend toward the biggest models are breaking down as serious buyers stop sending every task to the most expensive option. The assumption that a frontier model is a dependable foundation took a direct hit when one of the most capable models in the world was switched off overnight by a government order. And the layer that determines whether an AI product actually works in production, the orchestration and harness around the model, is turning out to be both the hard part and the part the labs don't control.

Taken together, these shifts suggest that the dominance of one or two frontier models is waning. We are hearing this theme reverberate across the entire category, on five dimensions:

Performance: Open-weight models have nearly closed the quality gap. Qwen3 matches Gemini 2.5 Pro on reasoning and coding. MiniMax and Kimi have hurtled to near-frontier capability, collapsing the old binary of “frontier = closed, open = worse.”
Customer value: Enterprises have decided the most expensive model rarely justifies its cost. Large enterprises like Uber and Priceline are now setting up guardrails to cap token budgets and curb tokenmaxxing — a signal that the "throw the frontier at everything" era is ending. As Bill Gurley recently put it, "No one wants all their token usage on a single company — especially the pricey ones."
Evolving stack: More of the defensible value is starting to reside in the harness around the model (the orchestration, memory, and proprietary data), not in the model itself. Satya Nadella recently argued that “AI ecosystems” will matter more than any single frontier model.
Regulatory pressure: Access to a closed model can be revoked overnight, and enterprise customers need to build systems on something more dependable. When Anthropic’s Fable and Mythos models went dark on June 12 under a government export order, the kill-switch risk became painfully evident. Fireworks founder Lin Qiao made the case for “owning vs. renting intelligence,” that is, moving dependencies away from closed frontier APIs.
Supply chain: Building frontier-scale compute keeps getting harder and more expensive, between scarce GPUs and local pushback on data centers. Especially as they prepare for the scrutiny of public markets, the model companies are tempering their infrastructure ambitions. For instance, OpenAI cut its infrastructure target from roughly $1.4 trillion to about $600 billion and shifted to renting compute it once planned to build.

Each of these points to a broader pattern: frontier model hegemony is being challenged on several critical fronts at once. That has implications for how AI is built, bought, applied, and invested in.

Let's dig in.

Harnesses + Scaffolding > Model

The model is increasingly a commodity layer. What will differentiate products is the orchestration, memory, workflow logic, and proprietary data wrapped around it. This is what’s often referred to as the harness: the scaffolding that translates raw model capability into something that actually works in production.

To understand why harnesses exist, start with what models can’t do on their own. A raw model takes in text (or images, audio, video) and outputs text. That’s it. Out of the box, it cannot maintain durable state across sessions, execute code, access real-time knowledge, or set up the environment needed to complete a task. Every one of those limitations is a harness problem to solve. The model answers the question; the harness decides which questions to ask, in what order, with what context — and keeps the whole thing running when the model halts, hallucinates, or hits the edge of its context window.

A harness includes everything from how context and memory are managed across sessions, to how tools are called and outputs validated, to how the system handles edge cases mid-execution. At its most concrete, it is the system prompts, tool definitions, orchestration logic, memory management, and the middleware that enforces deterministic behavior around a non-deterministic model.
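To make that list less abstract, here is a minimal sketch of one harness responsibility: validating output and retrying around a non-deterministic model. Everything in it is an assumption for illustration. `run_with_harness`, `make_stub_model`, and the JSON reply format are hypothetical names and conventions, and the "model" is a stub rather than a real API.

```python
import json
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a text reply.
ModelFn = Callable[[str], str]


def run_with_harness(model: ModelFn, task: str, memory: list, max_attempts: int = 3) -> dict:
    """Wrap a model call with context, output validation, and retries."""
    # Context management: prepend notes kept from earlier calls.
    context = "\n".join(memory)
    prompt = f"{context}\n\nTask: {task}\nReply with JSON containing an 'answer' field."
    for attempt in range(1, max_attempts + 1):
        raw = model(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: ask again
        if isinstance(parsed, dict) and "answer" in parsed:
            # Memory management: record the result for later sessions.
            memory.append(f"{task} -> {parsed['answer']}")
            return {"answer": parsed["answer"], "attempts": attempt}
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def make_stub_model(replies):
    """Stand-in for a real model: returns canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


memory = []
model = make_stub_model(["not json", '{"answer": "42"}'])
result = run_with_harness(model, "What is 6 * 7?", memory)
```

The stub fails once and then succeeds, so the harness absorbs the bad reply and the caller only ever sees validated output. A production harness would add tool calls, timeouts, and logging around the same loop.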

What a harness entails from “Agent Engineering”

In a broader sense, we think of this as taking advantage of your “organizational intelligence”: the accumulated knowledge, processes, and context that a company embeds into its AI stack over time. A security company that has processed ten thousand real incidents has organizational intelligence a general-purpose model cannot replicate in a six-week deployment. A legal AI that has learned the patterns of a specific court’s rulings, or a particular client’s contracting style, has built something genuinely proprietary. That’s the moat. The model is the engine; the harness is the vehicle.

Companies building model-agnostic architectures today, where the underlying model can be swapped without rebuilding the product, are quietly making one of the most important bets in the space. Think of it like building on a database abstraction layer rather than hardcoding Postgres: the application logic stays intact as the underlying engine changes. The companies that don’t build this way are one government directive, one pricing change, or one capability leap away from a rearchitecting crisis.
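One way to picture a model-agnostic architecture is a thin interface that product code depends on, with each model hidden behind an adapter. The sketch below is an assumption-laden illustration, not any vendor's API: `ChatModel`, `ClosedApiModel`, `SelfHostedModel`, and `SummaryProduct` are hypothetical names, and the adapters return canned strings where a real one would call an SDK or a local inference server.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatModel(Protocol):
    """The only surface the product is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class ClosedApiModel:
    name: str

    def complete(self, prompt: str) -> str:
        # A real adapter would call the vendor's SDK here.
        return f"[{self.name}] {prompt}"


@dataclass
class SelfHostedModel:
    name: str

    def complete(self, prompt: str) -> str:
        # A real adapter would call a locally hosted open-weight model here.
        return f"[{self.name}] {prompt}"


class SummaryProduct:
    """Application logic: knows about ChatModel, not about any vendor."""

    def __init__(self, model: ChatModel) -> None:
        self.model = model

    def summarize(self, text: str) -> str:
        return self.model.complete(f"Summarize: {text}")


product = SummaryProduct(ClosedApiModel("closed-frontier"))
before = product.summarize("Q2 report")

# Swap the engine; the product code above does not change.
product.model = SelfHostedModel("open-weight")
after = product.summarize("Q2 report")
```

The design choice is the same as with a database abstraction layer: the cost of a swap is one new adapter, not a rewrite of the workflow logic built on top.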

Even Nikesh Arora (CEO of Palo Alto Networks) is a proponent of harnesses!

Nikesh Arora@nikesharora

If you want fungibility across models, you need to build the harness, context, memory, intelligence and routing in the application layer. This allows you fungibility across models and compute location. Now @matanSF you just need to sell a secure instance of this :). Congrats on

9:50 PM · Jun 15, 2026 · 88.3K Views


There is more than one frontier now

The relevant question for builders is no longer “are you on the frontier?” but “which frontier, and for what?” There are several vectors for competing at the frontier: open-weights vs. closed-source; sovereign models vs. “tributaries”; owning your stack vs. renting your stack.

Open-weight models like GLM-5.2 (from Z.ai), Qwen, and MiniMax M3 are genuinely competitive for a wide range of production tasks. MiniMax M3 scores 59% on SWE-Bench Pro, edging out GPT-5.5 and ahead of Gemini 3.1 Pro, at roughly one-fifteenth the cost of closed frontier models. Qwen3's flagship 235B model benchmarks neck-and-neck with Gemini 2.5 Pro on reasoning and coding, with an Apache 2.0 license you can run on your own infrastructure. Neither has fully closed the gap to the top closed models — M3 trails Opus 4.8 meaningfully on abstract reasoning — but the gap is narrowing faster than most people expected, and for a wide range of production workloads, it's no longer the deciding factor.

Z.ai@Zai_org

Introducing GLM-5.2: Frontier Intelligence, Open Weights - Significant improvements in coding and agentic tasks - Strong long-horizon capabilities with a 1M context window - Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong

5:40 PM · Jun 16, 2026 · 2.56M Views


Matan Grinberg, founder and CEO of Factory, recently said that for software development tasks, open-weight models could now “probably do 80-90% of what a frontier model can do.” That’s a massive leap from even six months ago!

Sovereign AI is also getting a major boost in the wake of the Mythos/Fable shutdown. Politicians around the world, especially in Europe, are calling for increased funding for sovereign counterparts to Anthropic and OpenAI, optimized for local compliance and control requirements (like Mistral, rumored to be raising at a ~$23B valuation). Additionally, domain-specific fine-tuned models are outperforming general frontier models on narrow, high-value tasks. The practical implication: the model selection decision is now a product and business strategy decision, not just an engineering one. There are several real paths to glory, and the winner won’t necessarily be whoever is running the best general-purpose model.

Mistral AI CEO Arthur Mensch and Anthropic CEO Dario Amodei

Control of your destiny is a product decision

“Owning vs. renting intelligence” has moved from a philosophical preference to a risk management framework. The Fable shutdown made visceral what was previously theoretical: closed API access is a dependency with an external kill switch, one that can be pulled by a vendor repricing overnight, a competitive pivot upstream, or a government directive at 5pm on a Friday.
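In code, treating a closed API as a dependency with a kill switch usually means having somewhere else to send the request when access disappears. The following is a minimal failover sketch under stated assumptions: `ProviderUnavailable`, `complete_with_failover`, and both provider functions are hypothetical, and the "revoked" provider simply raises to simulate a shutdown.

```python
from typing import Callable, Sequence, Tuple


class ProviderUnavailable(Exception):
    """Raised by an adapter when its provider is down or access was revoked."""


def complete_with_failover(
    providers: Sequence[Tuple[str, Callable[[str], str]]], prompt: str
) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, reply)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why, then move on
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def revoked_api(prompt: str) -> str:
    # Simulates a closed API whose access was switched off.
    raise ProviderUnavailable("access revoked")


def self_hosted(prompt: str) -> str:
    # Simulates an open-weight model running on owned infrastructure.
    return f"self-hosted reply to: {prompt}"


used, reply = complete_with_failover(
    [("closed-api", revoked_api), ("self-hosted", self_hosted)], "hello"
)
```

The fallback may be weaker than the primary model, but the product keeps running, which is the point of framing this as risk management rather than preference.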

But the most durable form of control isn’t which model you run; it’s the data underneath it. Models are increasingly interchangeable; the proprietary data you’ve collected, cleaned, and structured is not. A startup that owns a high-quality, well-labeled dataset in its domain can fine-tune on whatever model is best this quarter and swap to a better one next quarter without losing what makes its product good. The company that rents the model and relies on generic public data has nothing that travels.

This is why the unglamorous work matters more than it looks: building the data pipelines, the labeling and filtering loops, the feedback mechanisms that turn raw usage into a proprietary training signal. The model is rented and replaceable. The data, and the systems that refine it, are owned and compounding. The question every builder should ask isn’t “which model is best,” it’s “what do I have that survives the model being swapped out from under me.”
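As a rough illustration of a feedback loop that turns raw usage into training signal, the sketch below filters an interaction log into prompt/completion pairs. The schema and rules are assumptions made for the example: `Interaction`, `to_training_examples`, and the choice to prefer user corrections over accepted outputs are hypothetical, and a real pipeline would add labeling, privacy filtering, and quality checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interaction:
    prompt: str
    response: str
    accepted: bool  # did the user keep the model's output?
    edited_response: Optional[str] = None  # the user's correction, if any


def to_training_examples(log: list) -> list:
    """Keep corrected or accepted outputs, drop the rest, and de-duplicate."""
    examples = []
    seen = set()
    for item in log:
        if item.edited_response:
            target = item.edited_response  # a correction is the strongest signal
        elif item.accepted:
            target = item.response
        else:
            continue  # rejected with no correction: no usable target
        key = (item.prompt, target)
        if key in seen:
            continue
        seen.add(key)
        examples.append({"prompt": item.prompt, "completion": target})
    return examples


log = [
    Interaction("p1", "r1", accepted=True),
    Interaction("p2", "r2", accepted=False, edited_response="r2-fixed"),
    Interaction("p3", "r3", accepted=False),
    Interaction("p1", "r1", accepted=True),  # duplicate of the first entry
]
examples = to_training_examples(log)
```

Note that nothing in the output refers to which model produced the original response. That is what lets the dataset travel when the model underneath is swapped.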

Lin Qiao@lqiao

https://t.co/rMVRgBvwEI

6:14 AM · Jun 15, 2026 · 748K Views


Token optimization is the next battleground

This is the one we’re watching most closely. The “tokenmaxxing” era — route everything through the most powerful frontier model and optimize later — is ending for serious enterprise deployments. Uber and Priceline setting hard token budgets isn’t a cost-cutting story; it’s a signal that enterprises are moving from experimentation to production discipline. Amazon and Microsoft are retreating from internal scoreboards measuring token use.

Why? Turns out using the most expensive frontier model for every task is not necessary.

So what does this unlock? Routing and orchestration layers (like OpenRouter) that intelligently allocate tasks to the right model at the right cost become critical infrastructure. Smaller, faster, purpose-built models get a second wind in use cases where they were previously passed over for expensive closed-source models. And startups that built for efficiency from the start, rather than bolting it on, have a structural advantage that the incumbents will be slow to match. In a world where inference cost is a board-level conversation, the intelligence stack gets a lot more interesting.
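A cost-aware router can be sketched in a few lines: pick the cheapest model that is capable enough for the task and still fits the remaining token budget. This is an illustration under assumptions, not how any particular routing product works. `ModelTier`, `TIERS`, `route`, the tier names, the prices, and the integer difficulty scale are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # illustrative prices, not real quotes
    capability: int  # 1 = simple tasks only, 3 = hardest tasks


TIERS = [
    ModelTier("small-open", 0.10, 1),
    ModelTier("mid-open", 0.50, 2),
    ModelTier("frontier-closed", 7.50, 3),
]


def route(difficulty: int, est_tokens: int, budget_left: float) -> ModelTier:
    """Return the cheapest tier that can handle the task within budget."""
    for tier in sorted(TIERS, key=lambda t: t.cost_per_1k_tokens):
        cost = tier.cost_per_1k_tokens * est_tokens / 1000
        if tier.capability >= difficulty and cost <= budget_left:
            return tier
    raise ValueError("no tier satisfies this difficulty within the budget")


easy = route(difficulty=1, est_tokens=2000, budget_left=5.0)
hard = route(difficulty=3, est_tokens=1000, budget_left=10.0)
```

Here the easy task lands on the cheapest tier and only the hard task reaches the expensive one. When the budget cannot cover a capable tier, the router raises instead of silently overspending, which is the kind of guardrail a hard token budget implies.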

Conclusion

The architecture lesson here isn’t “avoid closed models” or even that we aren’t bullish on OpenAI / Anthropic (we are, and they will almost certainly be multi-trillion dollar companies in the coming years). It’s that for the first time in many years, it finally looks like the era of frontier model dominance is over, and there have never been more paths to building and scaling an AI company.

The companies that look most durable to us right now are the ones treating models as interchangeable components rather than foundations — building proprietary data, domain depth, and workflow logic that doesn’t break when the underlying model changes, gets patched, or disappears. Owning vs. renting your intelligence is another key component.

As VCs, we’ve never been more excited for the future of startups and the ability to build world-class companies.

Giddy up!

A guest post by

Caleb Bushner

Head of Marketing at Madrona. Big fan of startups.

Aspiring for Intelligence
