The AI Subscription Buffet May Not Last Much Longer
The AI subscription deal is starting to change. AI subscriptions were never really unlimited, but they often felt that way. For a monthly fee, users could experiment freely, try bigger models, run coding assistants and push generous limits without thinking too much about the meter. That expectation is now being reset. As coding agents and long-running AI workflows burn through compute in ways simple chatbots never did, Anthropic, OpenAI, GitHub and others are redrawing the line between included access and usage-based billing. The AI subscription buffet may still be open, but the plates are getting smaller, the premium dishes are moving behind higher tiers and the meter is starting to matter.
The promise of AI is that we’d use it everywhere, for everything. AI vendors reinforced that idea by insisting the only way to get real benefit and ROI was to apply AI to everything that matters, and by making models easy to access and cheap to experiment with, teaching users to expect abundance. You paid a flat monthly fee and used as much as you liked within generous limits. New models arrived all the time, and you could simply try the biggest one, throwing everything you had at it. Ask as much as you like. Run the coding agent through a messy repo. Keep a dozen research threads open. Then things started to crack.
The latest signs that the free-for-all buffet might be coming to an end came from Anthropic, where a brief test appeared to remove Claude Code from the $20 Pro plan for some new users. The implication was that one of the most powerful tools that Anthropic has, Claude Code, might only be available to those on the highest subscription tier, Claude Max, or on a pay-per-token API basis. Ars Technica reported that Anthropic was experimenting with new ways to ration Claude Code access amid “untenable demand.” Business Insider reported that Anthropic later described the move as a limited experiment affecting about 2% of new users, not existing subscribers.
Those who discovered the change were up in arms. The reaction showed how sensitive customers have become to the fine print of AI subscriptions, and how quickly agentic AI can break the promise of flat-rate plans. The flat subscription was attractive because it borrowed from Netflix, Spotify and workplace software: pay each month, use the product as much as you like. It also lowered the anxiety around experimentation while AI was, and according to many still is, in its early days and looking to prove its value. But then agents arrived.
Agentic Systems Changed The Economics
A chatbot session is one thing. After all, there is only so much you can request on a synchronous basis when you are pushing text into and out of prompts. Long-running agent systems, whether coding agents or personal assistants like OpenClaw or Hermes, are another. A chatbot answers and stops. Agents keep poking around the repo, running tests and burning compute long after the first prompt.
GitHub has now said this directly. In an April 20 post, the company paused new signups for Copilot Pro, Pro+ and Student plans, tightened usage limits and removed Opus models from Pro plans. GitHub’s explanation was that Copilot users are no longer just asking for snippets or code completion. They are running agents for longer stretches, often side by side in parallel, and that burns more compute than the old plans allowed. GitHub said session and weekly limits are now tied to token consumption and model multipliers, meaning that in some circumstances just a handful of deep refactoring or otherwise complex requests can incur costs that exceed the plan price.
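To make the multiplier math concrete, here is a hypothetical sketch of how a multiplier-weighted allowance can be exhausted by a few expensive requests. The allowance, model names and multiplier values are invented for illustration; they are not GitHub's actual numbers.

```python
# Hypothetical sketch of a "model multiplier" plan limit.
# All values below are illustrative, not any vendor's real pricing.

MONTHLY_ALLOWANCE = 300.0  # weighted request credits included in the plan (assumed)

MODEL_MULTIPLIERS = {      # cost weight per request, by model tier (assumed)
    "base": 1.0,
    "reasoning": 3.0,
    "frontier": 10.0,
}

def credits_used(requests):
    """Sum weighted request costs: each request draws multiplier-many credits."""
    return sum(MODEL_MULTIPLIERS[model] * count for model, count in requests)

# A month of mixed usage: agents gravitate toward the expensive models.
usage = [("base", 120), ("reasoning", 40), ("frontier", 15)]
spent = credits_used(usage)
print(spent, spent > MONTHLY_ALLOWANCE)  # → 390.0 True
```

Under this toy scheme, fifteen frontier-model runs cost as much as 150 base requests, which is why a plan that feels generous for chat can be blown through by a week of agent work.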
Anthropic Is Already Moving Toward Usage Tiers
Anthropic’s pricing moves, along with its reliability and Claude Code performance issues, show that these tools are becoming harder to operate at the scale and intensity users now expect. Its Max plan, launched in 2025, offers 5x Pro usage for $100 per month and 20x Pro usage for $200 per month. Anthropic is making it clear that compute-hungry workflows are becoming harder to fit inside a $20 plan.
Anthropic has gone a step further with extra usage for Claude subscribers. Its support page says Pro, Max 5x and Max 20x customers can keep working after hitting plan limits by switching to pay-as-you-go pricing at standard API rates, with extra usage charged separately from the subscription. Claude Code terminal usage counts against the same limits as normal Claude conversations, research mode may consume tokens faster, and documents in projects count toward context when used in conversations. Unlike Netflix, where you can binge-watch until your eyes bleed, this is not unlimited access. It is an allowance with a meter, priced closer to cloud consumption than to an all-you-can-eat subscription.
OpenAI is making a similar turn with Codex. Its Codex rate card now prices usage based on API token consumption, with separate rates for input tokens, cached input tokens and output tokens. OpenAI says this replaces average per-message estimates with a direct mapping between model activity and credits. It also says Codex averages roughly $100 to $200 per developer per month, with wide variation tied to model choice, number of instances, automations and fast mode.
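A small sketch shows how this kind of token-class billing works in practice. The per-million-token rates below are made up for the example; real numbers live on the vendor's rate card. What matters is the shape: cached input is far cheaper than fresh input, and output is the expensive part.

```python
# Illustrative token-metered bill with separate rates per token class.
# Rates are assumed values for the example, not a real rate card.

RATES_PER_MTOK = {          # dollars per million tokens (assumed)
    "input": 1.25,
    "cached_input": 0.125,
    "output": 10.00,
}

def session_cost(tokens: dict) -> float:
    """Map raw token counts to dollars, applying one rate per token class."""
    return sum(tokens[k] / 1_000_000 * RATES_PER_MTOK[k] for k in RATES_PER_MTOK)

# One long agent session: a large cached context reread on every turn,
# modest fresh input, heavy generated output.
session = {"input": 2_000_000, "cached_input": 30_000_000, "output": 1_500_000}
print(round(session_cost(session), 2))  # → 21.25
```

A developer running a few such sessions a day lands comfortably in the $100-to-$200-per-month range the company describes, which is exactly why per-token metering replaces flat averages.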
The company introduced new Pro plan options in April, including a $100-per-month plan built for longer, high-intensity Codex sessions, with a temporary increase in Codex usage versus Plus. It also released the Codex desktop app as a command center for managing multiple coding agents in parallel, including long-horizon and background tasks. In other words, OpenAI is building products that invite heavier agent use while also building the billing rails to meter it.
Uptime And Performance Challenges
The pressure is not only showing up in plan limits. It is showing up in uptime, error rates and user perception. Claude has had visible reliability problems, with elevated errors across Claude.ai, the API and Claude Code, including login failures and periods when Claude.ai loaded but could not process prompts.
The company suffered not only availability outages, but also performance problems. Users complained that Claude was becoming forgetful, performing poorly with complex tasks and not responding well. Anthropic’s own Claude Code postmortem on recent performance provides some detail. Anthropic said users reported that Claude Code felt less intelligent after product changes around reasoning effort, caching and prompt length. The company said one caching bug dropped prior reasoning from later turns, which could make Claude forget why it had made certain choices. Anthropic wrote that the same issue likely drove separate reports of usage limits draining faster than expected. It also reverted a prompt change that had capped text between tool calls and final responses, after broader tests showed a drop in performance. Anthropic said the API and inference layer were not affected, and that the issues were resolved by April 20.
The combination of tightening limits and performance problems on flat-rate subscriptions leaves users frustrated. Users paying for an AI agent that needs to operate continuously to provide value are becoming more vocal about model quality, product bugs, latency, usage limits and downtime. These problems all feel like the same thing: the tool did not work when needed, or did not work as well as it did last week. The more AI companies ask customers to treat agents as embedded work systems, the less users will tolerate consumer-app-style flakiness.
The performance and model quality problems aren’t Anthropic’s alone. OpenAI has had visible hiccups around the same tools it is trying to turn into higher value subscriptions. On April 20, its status page recorded a partial outage in which users were unable to load ChatGPT, Codex and the API Platform. Three days later, OpenAI listed elevated errors in Codex affecting GPT 5.5 usage, with some users seeing intermittent “not found” errors when starting or updating requests.
AI Subscriptions Are Starting To Look Like Cloud Bills
All of this doesn’t mean AI vendors are intentionally or maliciously degrading models. The real issue is that serving these requests costs a lot of money, and demand is growing faster than planned. At least in Anthropic’s case, executives have acknowledged that usage patterns changed faster than expected after products such as Claude Code and other agentic tools took off. Every token carries an infrastructure cost, which gives vendors a constant incentive to tune the trade-off between better responses, faster service and lower cost.
The software industry has seen this movie before. Early cloud services promised elastic abundance. Then finance teams discovered runaway bills. What followed was a growing need for observability, cost management, reserved instances, budget alerts and workload shaping. AI is entering its own phase of controlled access, only much faster. It has its own versions of the old cloud headaches: long-running tasks on expensive compute, runaway agent loops and swollen context windows that quietly turn into cost problems.
The real story is that the days of the simple all-you-can-eat buffet may be coming to an end. Customers should stop judging AI plans by the headline price and start treating it as a base price that only grows with more intensive use. Buyers need to ask what counts against the limit, whether tool calls and files draw from the same pool, how premium models are charged and whether finance teams can see token spend before the invoice arrives or the credit card is charged.
For AI vendors, this is a tricky moment. Even as they push the story that AI belongs in every product, process and practice, they are clamping down on the runaway usage that strains their ability to deliver on that promise. The business model for AI is due for a change.
The AI subscription buffet is not vanishing overnight. Plenty of users will still get generous access, especially for light chat, search, writing and productivity tasks. AI bundled into larger software suites may preserve a simpler user experience for light tasks, but those bundles will still have to absorb the same compute costs somewhere, whether through seat prices, feature tiers or limits. Users may explore locally hosted or open source agents to reduce dependence on metered subscriptions. That will not make compute free, but it may shift costs from vendor usage meters to hardware, hosting and maintenance.
Regardless, the direction is hard to miss. The early gold-rush phase of AI is giving way to a more sober business phase. The open question: who pays when millions of users start treating expensive models and long-running agents as everyday infrastructure?