← All posts
Local AI Jun 11, 2026 3 min read

The "unlimited" AI plan was always going to end

The big AI companies are shifting from flat "unlimited" subscriptions toward paying for every token you use. Here is why that was inevitable, and why running your own models and tools is the best protection against being priced out.

Opinions are not my employer's.

If you have built any part of your work around a flat-rate AI subscription, watch which way the pricing is moving. The “unlimited” plan is on its way out, and pay-for-what-you-use is taking its place. This is not a temporary promotion ending. It is the predictable next step, and it is worth understanding before it reaches you.

The playbook is older than AI

Offer a generous flat plan, win a lot of customers, and wait for the product to become part of how those customers work. Once it is, the terms change. We have watched this happen with ride-share, with cloud storage, with streaming. AI is following the same arc, just faster. The “unlimited” tier was a cost of acquiring you, not a promise about the future.

Why AI makes this almost certain

Every request you send costs the provider real compute. Under a flat plan, your light months pay for everyone’s heavy ones, and that math only holds while most people barely use the thing. The better these tools get at real work, the more of it you run through them, and the heavier a user you become. Now add scarce compute and investors who eventually want margins. Metered pricing is simply where this lands. The cruel part is that your bill grows for the same reason the tool is worth having: because it is actually useful.

The part that should worry you

The risk is not the price today. It is the loss of control. When your workflow depends on a meter you do not own, someone else sets your marginal cost and can move it whenever they like. Your margins, your roadmap, and your ability to quote a fixed price to your own customers all sit downstream of a pricing page you have no say in. Dependence is the real exposure, not the current rate.

What actually protects you is owning your tools

The hedge is optionality. Run open models on hardware you control, and your marginal cost becomes electricity and time instead of a per-token rate that can change under you. Local models have quietly gotten good enough that a lot of everyday work does not need a frontier model at all: drafting, classifying, extracting, summarizing, and a surprising amount of coding. Just as important, the tools and harnesses around those models can be ones you own rather than rent.

This is most of what I build at L8NTLABS, and it is why I spend my own time on local-first projects. If the plumbing is yours, a pricing change is an inconvenience instead of an emergency.

Be honest about the tradeoffs

Local is not free and it is not always better. Hardware costs money, and the hardest tasks still favor the best hosted models. The point is not to run everything yourself out of principle. The point is to not be fully captured. A sensible setup uses hosted models where they clearly earn their cost, and local models for the steady, high-volume work where a metered bill would quietly eat you alive.

Own your critical path

Decide which parts of your work cannot afford to sit at the mercy of someone else’s meter, and move those onto tools you control. You do not have to walk away from hosted AI to do this. You just want to make sure that the day the unlimited plan ends, it is a minor annoyance for you and not an existential one. The companies are telling you where this is going. It is worth listening.

pricinglocal modelsstrategy
Work with me →