From Tokenmaxxing to Tokenminimizing: Big Tech Discovers That AI Is Too Expensive
The obsession with tokenmaxxing seems to have worn off. A year ago, big tech companies were running internal rankings that rewarded employees for the number of AI tokens they consumed; today, AT&T, Meta, Uber, Walmart, and Amazon are imposing spending limits, and the term for the new mood is the old one's opposite: tokenminimizing.
The shift has concrete causes. The companies most aggressive about AI spending now spend $7,500 per employee per month. Agentic tools, which call the model repeatedly in loops, have tripled enterprise bills even as the cost per token collapsed: the savings on unit price have been entirely absorbed by the growth in volume.
Business Cases
Uber exhausted its entire 2026 AI budget by April and now caps employee spending at $1,500 per month per tool. Meta has sharply reversed direction: after months in which teams competed to see who could consume the most, it is cutting spending on Anthropic and on other AI tools used internally. AT&T has limited some employees' access to GitHub Copilot. Walmart has introduced a fixed token allocation for Code Puppy, its internal agent, where consumption was previously unlimited. Amazon has scrapped its internal AI usage ranking after employees gamed it: to climb the ranks, they submitted artificial tasks to the model, inflating the counters without producing real value and driving compute costs through the roof. Microsoft has found a similar dynamic at the individual level: some engineers were spending between $500 and $2,000 per month on Claude Code tokens alone.
How Companies Control Costs
To reduce bills without cutting access, companies are adopting three approaches. The first is a shift to cheaper models: for simpler tasks, they are abandoning frontier models in favor of open-source or low-cost alternatives. The second is centralized spending control: Microsoft and Databricks have launched gateway tools to monitor and limit staff AI consumption in real time. The third is automatic task routing: Factory, funded by Nvidia and valued at $1.5 billion, has just launched a model router that directs less complex tasks to cheaper models.
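To make the second and third approaches concrete, here is a minimal sketch in Python of how a spending gateway and a model router might fit together. It is illustrative only and does not reflect how the Microsoft, Databricks, or Factory products actually work. The names `Gateway`, `pick_model`, and `BudgetExceeded`, the prices, the complexity score, and the threshold are all hypothetical assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical prices in dollars per million tokens (not real vendor pricing).
PRICE_PER_MILLION_TOKENS = {"frontier": 15.00, "economy": 0.50}


class BudgetExceeded(Exception):
    """Raised when a request would push a user past the monthly cap."""


def pick_model(complexity: float, threshold: float = 0.6) -> str:
    """Route by a complexity score in [0, 1].

    How the score is computed is assumed to be handled upstream.
    """
    return "frontier" if complexity >= threshold else "economy"


@dataclass
class Gateway:
    """Tracks per-user spend and rejects requests that exceed the cap."""

    monthly_cap_usd: float
    spent_usd: dict[str, float] = field(default_factory=dict)

    def charge(self, user: str, complexity: float, tokens: int) -> str:
        model = pick_model(complexity)
        cost = tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]
        spent = self.spent_usd.get(user, 0.0)
        if spent + cost > self.monthly_cap_usd:
            # Reject before recording anything, so the ledger stays unchanged.
            raise BudgetExceeded(f"{user} would exceed ${self.monthly_cap_usd:.2f}")
        self.spent_usd[user] = spent + cost
        return model


if __name__ == "__main__":
    gateway = Gateway(monthly_cap_usd=10.0)
    print(gateway.charge("ana", complexity=0.2, tokens=1_000_000))  # simple task
    print(gateway.charge("ana", complexity=0.9, tokens=500_000))    # complex task
    print(gateway.spent_usd)
```

In this sketch the router decides which model handles a task, and the gateway decides whether the task runs at all. Keeping the two separate means a company could tighten the cap without touching the routing rule, or the reverse.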
Satya Nadella framed the risk of concentration in a recent essay: "The last thing we want is a world where every company in every industry surrenders value to a few models that devour everything they see." Spending limits control costs, but they also risk slowing the productivity gains that justified the investment.