AI Strategy · Infrastructure · Efficiency

Sora's Shutdown and the Efficiency Story Told in Reverse

OpenAI killed Sora because $15 million a day in inference costs dwarfed $2.1 million in lifetime revenue. It is the strongest case study yet for why delivery economics — not capability — decides what survives.

Jake Chen · 5 min read

Personal perspectives only — does not represent the views of my employer.

On Monday, OpenAI shut down Sora.

The company's video generation app lasted roughly six months. It launched in late September 2025, peaked at 3.3 million downloads in November, declined 66 percent by February, and was dead by March 24.

The stated reason was a pivot toward robotics. The real reason was economics.

Forbes estimated Sora's peak inference costs at approximately $15 million per day, based on analysis assuming roughly $1.30 per 10-second video clip. In its lifetime, Appfigures estimates Sora made about $2.1 million in total revenue from in-app purchases. That means the entire lifetime revenue of the product covered roughly 3.4 hours of inference costs at peak.
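The arithmetic behind those two claims is worth making explicit. A minimal back-of-envelope check, using the Forbes and Appfigures estimates cited above (the daily burn figure is an estimate, not a disclosed number):

```python
# Back-of-envelope check of the figures cited above. Both inputs are
# third-party estimates (Forbes for costs, Appfigures for revenue),
# not disclosed numbers.
DAILY_INFERENCE_COST = 15_000_000  # USD per day at peak (Forbes estimate)
LIFETIME_REVENUE = 2_100_000       # USD, in-app purchases (Appfigures estimate)

# How many hours of peak inference does lifetime revenue buy?
hours_covered = LIFETIME_REVENUE / DAILY_INFERENCE_COST * 24

# What does the peak burn rate look like annualized?
annualized_cost = DAILY_INFERENCE_COST * 365

print(f"Lifetime revenue covers {hours_covered:.1f} hours of peak inference")
print(f"Annualized inference cost: ${annualized_cost / 1e9:.2f}B")
```

Run it and you get roughly 3.4 hours of coverage, and an annualized burn rate in the $5.5 billion range.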

[Interactive: The Sora Economics. Click any metric to see the full picture behind the shutdown.]

This is the efficiency story told in reverse.

Capability without delivery economics is a dead end

I have spent the last few posts arguing that the next phase of AI will be defined less by intelligence gains and more by gains in the economics of delivering intelligence. TurboQuant attacks memory costs. ChatJimmy attacks hardware costs. Both point toward a future where inference is cheap enough to embed everywhere.

Sora is the counter-case. It is what happens when you build a genuinely impressive capability and do not solve the delivery economics.

The model was not the problem. Sora could generate compelling video. The problem was that generating compelling video cost more than anyone was willing to pay. The product died not because the intelligence was insufficient, but because the infrastructure to deliver it affordably did not exist.

$5.5B

Annualized inference cost

At $15 million per day, Sora's estimated inference bill comes to roughly $5.5 billion a year, for a product that generated $2.1 million in total lifetime revenue.

This is not a niche failure. It is a structural lesson.

The demo trap, revisited

I wrote earlier about the demo trap — the pattern where AI products generate enormous initial excitement but fail to convert that excitement into retention. Sora is a textbook case.

One million downloads in five days. That is not a demand problem. People wanted to make AI videos. They wanted it enough to download the app, try it, and tell their friends.

But downloads are not usage. And usage is not a business. Monthly downloads fell 66 percent from November to February. The wow wore off. The use case did not stick.

Part of the issue was product-market fit. Video generation is still a tool looking for a daily workflow. But the deeper issue was cost structure. Even if retention had been strong, the economics were upside down. More usage would have meant more losses. Growth was the enemy.
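"Growth was the enemy" is a unit-economics statement, and it can be sketched in a few lines. The per-clip cost below is the ~$1.30 estimate cited earlier; the revenue-per-clip figure is a purely illustrative assumption (Sora never reported one), chosen only to show that any price below cost makes losses scale linearly with usage:

```python
# Toy unit-economics sketch. COST_PER_CLIP is the estimate cited in the
# article; REVENUE_PER_CLIP is a hypothetical illustrative figure, not
# a reported number.
COST_PER_CLIP = 1.30     # USD per ~10-second clip (estimated)
REVENUE_PER_CLIP = 0.10  # USD, hypothetical blended in-app revenue per clip

def contribution_margin(clips_generated: int) -> float:
    """Total margin on generated clips: negative whenever cost > revenue."""
    return clips_generated * (REVENUE_PER_CLIP - COST_PER_CLIP)

# With an upside-down margin, scaling usage scales losses.
for clips in (1_000_000, 10_000_000, 100_000_000):
    print(f"{clips:>11,} clips -> margin ${contribution_margin(clips):,.0f}")
```

Under these assumptions, each 10x in usage is a 10x in losses. Retention fixes nothing when the margin on every incremental clip is negative.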

That is the signature of a product where capability outran delivery.

Disney and the collapse of downstream confidence

The Disney deal makes the story more interesting.

Disney had agreed to license hundreds of name-brand characters for Sora-powered virtual avatars, reportedly with a $1 billion investment on the table. That deal collapsed with the shutdown.

This matters because it shows the second-order effects of unresolved delivery economics. It is not just that Sora lost users. It is that the entire ecosystem of partnerships, integrations, and revenue streams that was forming around the capability evaporated the moment the infrastructure economics proved unworkable.

Capability attracts partners. Sustainable economics keeps them.

What this means for the efficiency thesis

Sora's shutdown strengthens the argument I have been making in this series.

If you accept that AI's next phase is about delivery economics — about making intelligence cheap, fast, and structurally deployable — then Sora is the clearest warning of what happens when that economics layer is missing.

The TurboQuant story is about making memory 6x cheaper. The ChatJimmy story is about making inference 10x faster. The Sora story is about what happens when you skip that step entirely: you build a product that is too expensive to survive its own users.

The lesson is not that video generation is doomed. Other companies will solve the economics. The lesson is that capability without an efficiency stack is a research demo, not a business.

The cost of remembering and the cost of generating

There is one more connection worth drawing.

In my piece on TurboQuant, I argued that a lot of enterprise AI today is really scarcity management wearing the costume of elegance — we summarize because we cannot afford to remember, we prune because context is expensive, we build brittle retrieval layers because memory is scarce.

Sora had the opposite version of the same problem. It was not managing scarcity. It was drowning in abundance — generating outputs that were too expensive to sustain.

Both problems point to the same insight: the gap between what AI can do and what AI can afford to do is the most important gap in the industry right now. Closing it from the memory side (TurboQuant) and from the hardware side (ChatJimmy) are two paths toward the same destination. Failing to close it (Sora) is how promising products die.

The efficiency story keeps getting louder

Sam Altman told his staff that the Sora research team would continue to focus on world simulation research for robotics. That may well be the right long-term bet. But the short-term message is clear: even OpenAI, with more capital than almost anyone in AI, could not sustain a product where the inference economics were wrong.

If that is true for OpenAI, it is true for everyone.

The companies that win the next phase of AI will not just be the ones that build the most impressive capabilities. They will be the ones that solve the boring problems of cost, latency, memory, and deployment well enough that their capabilities can actually survive contact with real users at real scale.

That is the efficiency story. And Sora just made it impossible to ignore.
