Value for Money Is All You Need

1•BEKOUTI•1h ago

A reflection on the future of token consumption in artificial intelligence

Token consumption now sits at the center of the growing use of artificial intelligence by businesses and individuals alike.

The "TokenMaxxing" trap

In the early days, the trend was to maximize token consumption from proprietary LLMs regardless of cost, a practice seen as a marker of performance for the user, the employee, or the company. This phenomenon, known as "TokenMaxxing," reportedly exhausted Uber's entire annual budget before the year was over.

Faced with the enormous financial cost this TokenMaxxing generated, many companies and individuals turned to lower-cost LLMs to preserve their budgets — fueling the rise of Chinese open-source LLMs such as DeepSeek, in line with Harvard professor Clayton Christensen's theory that disruptive innovation can conquer a market through low prices.

Users thus found themselves facing a dilemma: choose a highly capable but token-expensive proprietary model, or a less capable but more budget-friendly open-source model.

The temptation of dumping

To resolve this dilemma, Sam Altman, CEO of OpenAI, promised to lower the cost of OpenAI's tokens — aiming to stand out from the competition, gain ground in the AI space, and make his highly capable models more accessible in terms of token cost.

While commendable, this initiative exposes OpenAI to two major risks:

A considerable financial risk: this token dumping could negatively impact OpenAI's profitability, making the strategy difficult to sustain over time.

A market risk: dumping in no way guarantees an increase in OpenAI's market share against a competitor like Anthropic, since users remain willing to pay a high price if they can afford it and if the expensive tokens they purchase generate returns far superior to those of cheaper tokens.

Current initiatives around token utilization

To resolve this cost-versus-quality trade-off faced by users, a new philosophy is now emerging: that of cost efficiency. Several interesting initiatives reflect this shift:

OpenRouter merges models in an attempt to reduce costs while still providing access to the most powerful models available, but the operation of its AI agents generates considerable hidden costs.

Chinese open-source models such as GLM 5.2 are highly capable and cheaper than proprietary models from OpenAI or Anthropic, while still being notably more expensive than other open-source models.

Ponytail strips away everything superfluous in code to preserve only the essential, thereby reducing token cost while preserving quality regardless of the LLM used, but it risks being too minimalist and insufficiently flexible to understand the context in which a user introduced lines of code that are essential to them, but which Ponytail might judge as superfluous.

Headroom promises, through compression, to cut token costs by 95%, but the hidden costs tied to running its AI agent risk undermining this commendable goal.
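To make the "hidden costs" point concrete, here is a rough arithmetic sketch of how overhead can eat into a headline compression figure. Everything in it is an assumption made for illustration: the function name, the price per million tokens, the workload size, and the overhead figure are hypothetical and do not describe how Headroom or any other tool is actually billed.

```python
def net_token_cost(tokens, price_per_million, compression_ratio, overhead_tokens=0):
    """Dollar cost after compression, plus the tokens the tool itself consumes.

    All parameters are illustrative assumptions:
      tokens            -- tokens the workload would use without compression
      price_per_million -- price in dollars per one million tokens
      compression_ratio -- fraction of tokens removed (0.95 means 95% removed)
      overhead_tokens   -- extra tokens spent by the compression agent itself
    """
    if not 0.0 <= compression_ratio <= 1.0:
        raise ValueError("compression_ratio must be between 0 and 1")
    billed_tokens = tokens * (1.0 - compression_ratio) + overhead_tokens
    return billed_tokens * price_per_million / 1_000_000


# Hypothetical workload: 10 million tokens at 5 dollars per million.
baseline = net_token_cost(10_000_000, 5.0, 0.0)

# Same workload with 95% compression, but the agent spends 2 million tokens.
compressed = net_token_cost(10_000_000, 5.0, 0.95, overhead_tokens=2_000_000)

# Share of the original bill that is actually saved once overhead is counted.
effective_saving = 1.0 - compressed / baseline

print(f"baseline:   ${baseline:.2f}")
print(f"compressed: ${compressed:.2f}")
print(f"effective saving: {effective_saving:.0%}")
```

With these made-up numbers, the advertised 95% reduction shrinks to an effective saving of about 75% once the agent's own consumption is billed. The real gap depends entirely on the actual overhead, which the figures above do not claim to measure.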

Ultimately, all of these projects are commendable and worth encouraging, as they help address a problem that still stands in the way of the broader adoption of artificial intelligence.

The real challenge: Value For Money

In my view, the real challenge lies neither in price, nor in quality, nor in a performance-cost trade-off, nor even in cost efficiency. The real challenge lies in Value For Money.

Value For Money rests on three cumulative criteria:

Cost

Quality

Protection against risk(s)

Together, these three criteria deliver the best quality, at the lowest cost, with the least risk.
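As a thought experiment, the three criteria can be turned into a small selection rule. The essay does not define a formula, so what follows is only one possible reading of "cumulative": an option must clear a quality floor and a risk ceiling before its cost is even considered. The class, the thresholds, the scoring formula, and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    name: str
    cost_per_task: float  # dollars per task, must be positive (assumed unit)
    quality: float        # 0.0 to 1.0, higher is better (assumed scale)
    risk: float           # 0.0 to 1.0, higher is riskier (assumed scale)


def value_for_money(option: Option) -> float:
    """Hypothetical score: risk-discounted quality per dollar spent."""
    if option.cost_per_task <= 0:
        raise ValueError("cost_per_task must be positive")
    return option.quality * (1.0 - option.risk) / option.cost_per_task


def pick_best(options: Sequence[Option], min_quality: float,
              max_risk: float) -> Optional[Option]:
    """Keep only options meeting both thresholds, then take the best score."""
    eligible = [o for o in options
                if o.quality >= min_quality and o.risk <= max_risk]
    if not eligible:
        return None
    return max(eligible, key=value_for_money)


# Made-up options, not measurements of any real model.
options = [
    Option("expensive-proprietary", cost_per_task=0.40, quality=0.95, risk=0.05),
    Option("cheap-open-source", cost_per_task=0.05, quality=0.70, risk=0.20),
    Option("mid-tier", cost_per_task=0.10, quality=0.90, risk=0.10),
]

best = pick_best(options, min_quality=0.85, max_risk=0.15)
print(best.name if best else "no option meets all three criteria")
```

In this toy setup the cheapest option is excluded because it misses the quality and risk thresholds, and the most expensive one loses on cost, so the middle option is selected. Changing the thresholds or the scoring formula changes the outcome, which is why they are flagged as assumptions rather than as part of the argument.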

A new philosophy

Value For Money is the new paradigm that should guide AI labs and companies in how they approach token usage. That is why I am currently working on a project, soon to be available, to help remove the obstacle that token consumption represents, together with anyone willing to join me on this journey.
