frontpage.

Amazon employees are "tokenmaxxing" due to pressure to use AI tools

https://arstechnica.com/ai/2026/05/amazon-employees-are-tokenmaxxing-due-to-pressure-to-use-ai-tools/
73•Bender•49m ago

Comments

x187463•37m ago
Measuring token usage as a productivity metric is like measuring keystrokes. Don't mind me, just over here rolling my face on the keyboard for an hour so I can take Friday off...

...except each keystroke has an associated cost, the sum of which may equal or exceed my salary.

Weryj•35m ago
Insert photo of the Simpsons drinking bird while Homer sleeps here.
Analemma_•31m ago
What's nuts is how many intelligent people— people who would say "of course 'LOC written' is a terrible measure of developer productivity, of course only a dysfunctional company run by morons would do that"— have immediately bought into this. Amazon has token use mandates, I've heard Google has token use "leaderboards", friends at startups say they all get graded on tokens used. It's like watching your sensible, levelheaded friend go completely off the rails; collective madness.
Imustaskforhelp•21m ago
> collective madness

mass hysteria perhaps?

There was a time when people died from dancing too much (as I understand it, and hey, I could be wrong, I usually am): https://en.wikipedia.org/wiki/Dancing_plague_of_1518

I think that although we like to consider ourselves smart and really intelligent, we run on biological machines and clocks that haven't changed much, evolutionarily, since 1518, or even since the times when we hunted and foraged, for that matter.

HPsquared•20m ago
It's a test of practical intelligence.
greesil•19m ago
Some people respond to incentives. The rest of us are just trying to do our jobs and will probably be fired and then later consumed by the basilisk. We are living in an age of extremophiles.
guywithahat•33m ago
This reads more like a single employee's gripe than a real thing that's happening. They're not using the metrics in performance reviews, and it's a new AI tool that AWS probably wants legitimate usage data from.

That said, if you can't figure out how to use AI in a software job you should look into it. Not using AI at this point is a lot like not using CAD as an architect.

KyleTheDev•27m ago
It is being used in performance reviews, source: recent Amazon SWE.

They also use a bunch of dumb metrics: total PRs submitted, total comments made on PRs, etc. To the point that there are multiple heavily used internal tools to game these metrics, e.g. auto-commenting LGTM on any approved PR, thus making the metrics even worse than they would have been otherwise.
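The LGTM-bot trick described above fits in a few lines. This is a toy illustration under assumed names, not Amazon's actual tooling: `lgtm_bot` and the PR dict shape are invented for the example.

```python
# Toy sketch of the metric-gaming described above: auto-comment "LGTM"
# on every already-approved PR, inflating the "comments made on PRs"
# metric without adding any review value. Names/fields are invented.

def lgtm_bot(prs):
    """Return (pr_id, comment) pairs for every approved PR."""
    return [(pr["id"], "LGTM") for pr in prs if pr.get("state") == "approved"]
```

Goodhart's Law in miniature: the comment count goes up while review quality is untouched.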

sarchertech•25m ago
I think it’s real. I’m at a huge SV tech company and at least half the people here are “token maxing”.

AI is genuinely useful for many tasks. But 2x-or-greater business value from engineering orgs isn't one of them. And even if it were, businesses are terrible at measuring value added on an individual basis.

What they can measure though is token use. I’ve heard the same thing from other large companies my friends work for.

It’s bad enough that I’ve moved a significant amount of money out of US large-cap stocks.

riknos314•22m ago
Amazon has far more roles than just software. PMs, FC area managers, managers - if your job involves writing anything you're expected to be using AI in some capacity.
retinaros•5m ago
We can tell they are using AI
fg137•22m ago
"They're not using the metrics in performance reviews" means almost nothing. It doesn't mean managers at every level aren't frequently looking at those numbers. Anyone from Amazon will tell you how many "hints" they get from management about using those tools.
bigstrat2003•21m ago
> That said, if you can't figure out how to use AI in a software job you should look into it. Not using AI at this point is a lot like not using CAD as an architect.

When LLMs are capable of actually doing a good job, then it might be like that. We are not there yet, and we may never be.

righthand•18m ago
I have not used AI since the beginning, and nothing has changed for me. I have only watched my coworkers and the industry get dimmer, and get faster at getting dimmer. I have witnessed professionals become total amateurs and offer "well, the AI generated this unreviewed report" as their basis of knowledge.

No thanks I’ll just watch y’all slip down the slope.

mrhottakes•16m ago
Agreed. AI usage seems to be mostly bragging on HN / LinkedIn
mrhottakes•17m ago
> Not using AI at this point is a lot like not using CAD as an architect.

Does CAD software regularly generate an incorrect design that results in a catastrophic failure of the building?

12_throw_away•12m ago
> They're not using the metrics in performance reviews

Heh. No need to be ashamed, I used to believe them when they lied to me like this too!

HarHarVeryFunny•7m ago
Apparently it's real. Meta has a tokenmaxxing leaderboard too.

"Wow, look at how fast employee # 2 is setting money on fire! Let's promote him!"

some_furry•33m ago
Can't you just wire your agent into a Python script and have it infinitely check its own work? That would hit the metrics but do nothing useful.

Hell, throw a Tarot reading in the middle of the loop so the agent has non-deterministic behavior too.

https://github.com/trailofbits/skills/tree/main/plugins/let-...

Amazon management wants to play five-dimensional chess? Play Balatro instead.

tyleo•32m ago
I was thinking about this recently. I tend to run my AI at low context because the documentation states that models degrade with higher context usage.

However I see tons of people on LinkedIn with ways of backing up context, not wanting to lose context, etc.

This seems like another way the system is being misused. Higher context usage also burns more tokens. I suspect you get worse (and slower) output, too, than with a dense, detailed context.

jaggederest•17m ago
I think there are two motivations that get blurred pretty quickly:

a) you find a particular context that executes well and want to preserve parts of it or not have to repeat explanations

b) you want to continue a session so you don't have to rebuild the context from scratch

I think A is something where it's totally reasonable to preserve pieces as part of a prompt library or equivalent, or directory-specific agent files, that kind of thing.

I think B is much more likely to lead to problems if you do it over a long time, but it can be pretty useful for getting the last drop of juice out of the metaphorical orange.

I think the antipattern (that I've done myself, admittedly) is swapping between different restored contexts for different tasks or roles - at that point you should be either converting it to more durable documentation if warranted, or curating it more specifically than "restore the entire context" even if it's just one-off.

mikepurvis•12m ago
I think the answer for both cases is supposed to be finishing a "good" session with "based on what you've learned about this project, please update the CLAUDE.md/AGENTS.md/README.md files."

Ideally that replaces the back-and-forth cycle of "it's this, no it's that, it's that for reasons XYZ" with a single ingestible blob that gets the agent up to speed.

mikepurvis•15m ago
I think the more you anthropomorphize it the more it feels like "but I don't want to have to start all over getting it up to speed, this instance already knows all the important stuff."

If every exchange is treated as an independent query/response then it's much easier to see how cutting out the fluff using a combination of its summaries and your own helps stay focused.

tapoxi•25m ago
I joked about this on HN a few weeks ago and I find it funny that we ended up here already. Goodhart's Law in action.
varispeed•25m ago
Someone pressuring you to do something at work gives off creep vibes.

Is using AI tools in the contract? If not, then what are they on about?

woah•20m ago
They're always pressuring me into "shipping" "features"
ge96•18m ago
The worst place I worked, you had to install an app like Time Bro and account for all 8 hours of the day; the app logged activity per minute/hour.
varispeed•11m ago
Would rather eat dirt than work at a place like that. Respect.
mrhottakes•18m ago
"someone pressuring you to do something at work" describes pretty much all jobs

Very very few jobs in the US give you a contract.

i7l•21m ago
The fact that management signed off on measuring AI use through token usage shows how incompetent management really is, including at allegedly technical companies like Amazon. Tokenmaxxing was an entirely expected and rational response. IOW, if you measure employees in stupid ways, you're going to get stupid behaviour as a consequence.
koolba•8m ago
Management loves numbers because they’re the only things you can objectively compare as X > Y.

It makes for pretty charts, extrapolations, and projections.

It doesn’t matter if the numbers are not particularly correct. As long as the data-gathering step can be justified, it’ll do. Bonus points if making the number bigger is a good thing (vs. tracking something like the number of sev-1 issues).

delfinom•8m ago
Yes, but also because management is largely unqualified to be managing the stuff they are hired for. So they regress to numbers because they otherwise cannot participate in anything technical.
wordpad•8m ago
Depends on what they're trying to incentivise.

It's quite possible they aren't trying to measure performance but are literally just trying to increase token consumption to feed the bubble and hype.

Plus, pressured employees may find new, unique use cases for AI.

It's like if your goal is inflation: you give out tons of money, and as long as it's spent, you achieve your goal.

christkv•19m ago
Seems to be a clear case of Goodhart's Law that states that "when a measure becomes a target, it ceases to be a good measure."
FartyMcFarter•17m ago
That's true, but I don't know if this one was ever a good measure in the first place.

People use AI differently and they can be equally productive with a variety of token usage quantities.

Also, different kinds of work are differently amenable to using AI.

compiler-guy•14m ago
Measuring tokens used can absolutely be useful: tracking cost and compute demand, negotiating a better contract based on usage, and so on.

Using it to grade people is, err, rather unwise.

pjmlp•17m ago
I can tell they are surely not the only ones.

Everyone I talk to nowadays has KPIs tied to AI usage in their performance evaluation.

jnpnj•6m ago
Corporate emails asking "why are you not using the <insert-llm> paid plan???" came very, very rapidly. So naturally, everybody started using it blindly so that the dashboard metrics are all high.

It's astonishing how society forgets.

H8crilA•6m ago
The most important skill is to not stand out from the crowd. That's how you survive in the Soviet Union, in the army, and clearly also at tech companies.
Argonaut998•16m ago
I swear the industry is being Garry Tanned.

Senior management let our localisation staff go. Now they want us to use AI to translate. They still want manual review.

We use GitHub Copilot at work; we get a measly 300 requests, with budget to go over if necessary. Opus 4.7 or GPT 5.5 would eat all of those up in a day. Are we supposed to be using more than the allotted amount? Do management see that as a good thing? Or is it best to stick within the allocated amount? Who knows? Management are playing games everywhere, it seems.

nextlevelwizard•13m ago
How do you burn 300 requests in a day? In my Copilot usage, Opus consumes surprisingly few requests to do a lot of stuff. You aren't paying by token but by prompt or something.
devmor•9m ago
If you are using subagents for asynchronous work, you can burn through 300 requests in a workday easily.
ex-aws-dude•15m ago
Imagine selling a product where companies are foaming at the mouth to increase their spend and pay you more money

It does not get any better than that

Jensen, Sam, Dario: https://i.imgur.com/AI7rtCY.jpeg

baxtr•15m ago
“Show me the incentive and I'll show you the outcome.”

― Charlie Munger

asdfman123•10m ago
Would that make chasing perverse outcomes in the corporate environment the Munger Games?
asdev•13m ago
People who don't code (management, leadership) think AI will 10x the company, but it's really a 40-60% boost. And engineers have to feign adopting these tools for fear of layoffs.
asdfman123•10m ago
40% boost for smart engineers, for now.

People churning out slop is slowing me down and the full effects of it won't be felt for a while.

retinaros•6m ago
It's not really 60%. It accelerates code creation a lot and saves some time on admin tasks. That's it.
oytis•5m ago
> 40-60% boost

Where? What industry, what kind of projects? The only one where I can imagine it to be true is vulnerability research, and I imagine all the low-hanging fruit to be picked soon

ortusdux•11m ago
Reminds me of the managers that use 'lines of code added' as a metric
asdfman123•11m ago
Saw a good joke on twitter about it. Something like:

"You spent $23, over the $20 food limit. Be more careful next time. You spent $600 on tokens, $200 more than the average. Congratulations!"

HarHarVeryFunny•10m ago
> They said the move reflected pressure to adopt the technology after Amazon introduced targets for more than 80 percent of developers to use AI each week, and earlier this year began tracking AI token consumption on internal leader boards.

This measuring of tokenmaxxing as a proxy for something beneficial to the company has got to be the single dumbest thing I have ever heard of in my entire software career.

It would be like some company in the dot-com era measuring employees' internet download traffic as a proxy for productivity or internet-pilledness.

Why not just reward employees based on who submits the largest expense claims? That might have some correlation to work too, right?!

asdfman123•7m ago
In the corporate world it's impossible for any one person to tell what's going on across multiple domains due to the complexity. If I tell you the Zorbulon API is creating 30% more flargs (which is critical for Twiddle operation), I often just have to take your word for it.

Hell, I'm in the bowels of Google as an IC and it's hard to understand what adjacent teams are doing. Even harder for management that never gets their hands on anything.

So while you know engineers are probably bullshitting you with fake work, you can at least turn around and tell your supervisor the numbers. It's all a game of plausible deniability.

morelandjs•9m ago
I have mixed thoughts on this. These thoughts are my own. On the one hand, it’s objectively silly to pretend like we’ve solved the age old problem of measuring developer productivity. Metric-obsessed leadership can also be intolerable, counterproductive, and it’s a good way to paint yourself into a corner undervaluing your best talent and overvaluing your mediocre talent.

That said, I'm kind of having a blast using CC in corporate with all the connectors at our disposal, and I'm baffled at how little some of my coworkers know about what's available and what the capabilities are. So it's clear that perhaps some encouragement is prudent for those who are slower to embrace new technologies, but I'm not sure token-counting and tokenmaxxing are the answer.

retinaros•7m ago
Could you list some of the capabilities you use that bring value, besides "summarize my email"?
retinaros•9m ago
Vibe-coded PPTs, docs, and frontends are an even bigger scam than crypto ever was. Of course people are getting sucked into it.
dogscatstrees•5m ago
Another stupid meme-latching name. Don't normalize these *maxxing nonsense words; just use plain language. Let's see, maybe just say they were optimizing for token count?

Show HN: Kplane – Isolated cloud environments for AI agents

https://www.kplane.dev/
1•lexokoh•31s ago•0 comments

Why do tubes sound different than transistors?

https://allforturntables.com/2023/10/23/why-do-tubes-sound-different-than-transistors/
1•rolph•1m ago•0 comments

Is This Why Science Advances One Funeral at a Time?

https://nautil.us/is-this-why-science-advances-one-funeral-at-a-time-1280650
1•Brajeshwar•2m ago•0 comments

Collaborate – a Claude skill for multi-person AI-assisted document writing

https://github.com/googlarz/collaborate
1•googlarz•7m ago•0 comments

The Exception Economy

https://replicacyber.com/the-exception-economy-when-every-path-forward-has-a-cost/
1•gmljustin•8m ago•0 comments

Where do we think we're going with proprietary AI?

https://crib.social/notice/B6E7MXvhQQXjSdAmOG
1•gslepak•8m ago•0 comments

Wigle: Crowdsourced network map of BT, BLE, WiFi points via wardriving phone app

https://wigle.net
1•cdvonstinkpot•9m ago•0 comments

In a quest to becoming AI-independent

https://adlrocha.substack.com/p/adlrocha-in-a-quest-to-becoming-ai
1•gmays•11m ago•0 comments

Brilliant Labs brooks no dissent as the Halo SNAFU continues

https://jfloren.net/b/2026/5/12/0
1•floren•11m ago•0 comments

Ask HN: Job Search Skill for Claude?

1•OldSchoolTV•12m ago•0 comments

Microsoft's $1B AI data center will "switch off half of Kenya"

https://www.windowscentral.com/artificial-intelligence/kenya-president-warns-microsofts-1-billion...
2•pjmlp•12m ago•1 comments

Original Sachertorte loses in blind test against cheap supermarket brands

https://konsument.at/test/tastecheck-sachertorte
1•Markoff•12m ago•0 comments

Mass Supply Chain Attack Hits TanStack, Mistral AI NPM and PyPI Packages

https://safedep.io/mass-npm-supply-chain-attack-tanstack-mistral/
1•ezekg•12m ago•0 comments

Radar Trends to Watch: May 2026

https://www.oreilly.com/radar/radar-trends-to-watch-may-2026/
1•mparnisari•12m ago•0 comments

Platform Timing Is a Strategic Decision

https://www.michael-ploed.com/blog/platform-timing-is-a-strategic-decision
2•gpi•13m ago•0 comments

Googlebook, Designed for Gemini Intelligence

https://blog.google/products-and-platforms/platforms/android/meet-googlebook/
4•meetpateltech•13m ago•0 comments

Microplastics turn up in nearly every human brain sample

https://medicalxpress.com/news/2026-04-microplastics-human-brain-sample-healthy.html
2•PaulHoule•13m ago•0 comments

Requirements analysis for agents: catch requirement bugs before they become code

https://kiro.dev/blog/deep-spec-analysis/
1•thecampfire•13m ago•0 comments

Hardening TanStack After the NPM Compromise

https://tanstack.com/blog/incident-followup
1•ssiddharth•13m ago•0 comments

Probe: AI Agent Context Engine

https://github.com/zeroentropy-ai/probe
1•medbar•13m ago•0 comments

Show HN: Vibe – Responsible AI Review for Cq (Stack Overflow for Agents)

https://blog.mozilla.ai/first-line-of-defense-for-cq/
1•lmushro•14m ago•0 comments

Leak reveals Google's Aluminium OS with a 16-minute video

https://www.androidauthority.com/google-aluminium-os-leak-3665979/
1•theanonymousone•15m ago•0 comments

The Great Zombification

https://www.thenewcritic.com/p/the-great-zombification
2•johannbartlett•17m ago•0 comments

World Models: Things That Matter in AI

https://www.technologyreview.com/2026/05/12/1137134/world-models-10-things-that-matter-in-ai-righ...
1•joozio•17m ago•0 comments

Ask HN: How can I get a bank account for a minor owned LLC?

1•iloveplants•18m ago•0 comments

Show HN: One-shot NAT traversal library

https://warpgate.io/
1•Uptrenda•19m ago•0 comments

Show HN: Profine – Profile and rewrite your PyTorch training loop on real GPUs

https://github.com/ProfineAI/profine-cli
2•aisinghal•19m ago•0 comments

Vaultly Business News

https://vaultly-omega.vercel.app/en
1•peppee•19m ago•1 comments

Ledgr – Self-hosted finance app with Plaid bank sync and an MCP server

https://github.com/KenTaniguchi-R/ledgr
1•kentaniguchi•19m ago•0 comments

Pi Agent Being Merged?

https://earendil.com/posts/press-release-april-8th/
3•lygten•19m ago•2 comments