frontpage.

Uber caps employee AI spending after blowing through budget in four months

https://techcrunch.com/2026/06/02/uber-caps-employee-ai-spending-after-blowing-through-budget-in-four-months/
58•notfried•1h ago

Comments

ChrisArchitect•1h ago
Related:

Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing

https://news.ycombinator.com/item?id=48268871

Uber torches 2026 AI budget on Claude Code in four months

https://news.ycombinator.com/item?id=47976415

Corporate America Is Starting to Ration AI as Cost Skyrockets

https://news.ycombinator.com/item?id=48335388

socketcluster•1h ago
It makes me wonder about the state of their codebase if devs need to consume more than $1500 per month.

It's interesting that AI is finally forcing businesses to think about coding maintenance costs though.

When I started working on https://saasufy.com/ as a dev tool many years ago, I was frustrated that no big company cared about software maintenance costs and I really couldn't imagine a world where maintenance costs would be a problem (which is what my platform was addressing). So this is one positive thing from my perspective, I guess. But how much longer before people put 2-and-2 together and realize that architectural complexity is the leading cause? That's the real moment I'm still waiting for.

Will what's left of the socio-economic system be sufficiently capitalist that I will be able to capitalize on that? That's my next problem.

cactusplant7374•1h ago
If for each story the developer needs to fetch context for tens of microservices, I could see them using a lot of tokens.
tsvetkov•1h ago
Why do you think the cap has anything to do with the quality of their codebase? Employees could've been tokenmaxxing for various reasons: learning, experimenting, trying to impress the management, ... Naturally, this leads to AI spending skyrocketing while the business value may not be totally clear. Which leads to caps being introduced to keep the budget under control and discourage/limit tokenmaxxing.
socketcluster•25m ago
It's based on my experience as a software engineer who has worked on both clean and messy codebases with AI.

It's a very different experience with a messy codebase. In this case, the agent spends most of its time trying to gather the relevant context and it's like a game of whac-a-mole. The agent burns through tokens and can take a long time to resolve the issue with a lot of human intervention required. I would say it takes possibly just as long or longer than a human engineer would. Also, psychologically, the temptation for the engineer to trust the AI is massive because they don't want to load themselves up with all that ugly, complex context. They are more likely to let the agent create more hacks on top.

On a relatively well-structured codebase with loose coupling and high cohesion, the experience is usually very positive, mind-blowing, even; because it feels like the agent is reading your mind and fast-forwarding you. You don't need to correct it as much. And when you do, it's usually minor things.

The first case represents a net loss of value because tech debt is being added, compounding the complexity each time a problem is 'solved'. The second case, on the other hand, is a significant speedup; for me, I would say it's at least 5x. I love using AI in this way. I'm in control and not at the mercy of the agent.

prymitive
baq•1h ago
They’ll switch to DeepSeek right when Anthropic IPOs. Amazing timing
bijowo1676•1h ago
Thanks to OpenAI's and Anthropic's eye-watering valuations and token pricing, software engineers get to live another day without layoffs, because carbon-based lifeforms are cheaper than silicon-based lifeforms for now...
nate•1h ago
It's funny the convos I now have with Sonnet that I wasn't having with Opus. I feel like most of us here are starting to be told to draw down some of our 1M Opus xtrahigh thinking tokens :)

Is anyone using a local router to deal with that? Something that's like "don't even bother with Sonnet for this task, just go with Opus". I wonder if Haiku could even do that math and recommend the model you should be in?

zwigglers•54m ago
The version that probably works better is triaging in advance what's definitely not Opus territory: summaries, documentation, test generation.
jaggederest•50m ago
my task workflow uses something like opus to evaluate the roadmap, sonnet to divide the tickets by complexity, and then dispatch them to the relevant models - I use haiku or openai's spark models (spark is FAST! and DUMB!) for the simplest, and ascending in complexity. I find mid tier sonnet and gpt5 are pretty competitive, and reserve opus for truly "rearchitect the app from scratch" style tasks.

But all that might be somewhat obsolete, the latest update for claude code looks like it uses workflows with various models, so they might already be optimizing that.
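
The triage-then-dispatch idea in this subthread could be sketched roughly as follows. This is an illustrative sketch only: the tier names, the `Ticket` fields, the complexity scale, and the thresholds are all assumptions made up for the example, not real model IDs or anyone's actual workflow. It also assumes a complexity score has already been assigned upstream (in the comment above, by a mid-tier model).

```python
from dataclasses import dataclass

# Hypothetical tier labels for this sketch; not real model identifiers.
TIERS = ["small", "mid", "large"]

# Task kinds suggested above as "definitely not Opus territory".
CHEAP_KINDS = {"summary", "documentation", "test_generation"}


@dataclass
class Ticket:
    title: str
    kind: str        # e.g. "summary", "bugfix", "rearchitecture"
    complexity: int  # assumed scale: 1 (trivial) to 10 (rearchitect the app)


def route(ticket: Ticket) -> str:
    """Pick the cheapest tier that is plausibly adequate for the ticket."""
    if ticket.kind in CHEAP_KINDS:
        return "small"   # triaged out in advance, regardless of score
    if ticket.complexity >= 8:
        return "large"   # reserved for "rearchitect from scratch" work
    if ticket.complexity >= 4:
        return "mid"
    return "small"


def dispatch(tickets):
    """Group ticket titles by the tier each one was routed to."""
    plan = {tier: [] for tier in TIERS}
    for ticket in tickets:
        plan[route(ticket)].append(ticket.title)
    return plan
```

The thresholds (4 and 8) are arbitrary; in practice they would presumably be tuned against how often a cheaper tier's output has to be redone by a more expensive one.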

rluna828•1h ago
Claude's Law: "Token consumption grows faster than the cost per token falls."

The Red Queen's Haiku:
Run faster, she said,
each cheaper token consumed
to hold the same place

Mr. Meeseeks' Law: "An agent that cannot finish a task spawns another agent to help. No task reveals its difficulty until it is attempted; as such, the cost of any unattended task can exceed its value."

willis936•1h ago
Hi I'm Mr. Meeseeks.
andyferris•1h ago
I’m confused why a business would allow people (other than data-science or agent-harness devs) to pay per token instead of, e.g., an Anthropic business premium seat. A monthly subscription seems pretty straightforward for the accountants, no?
Drakim•1h ago
You aren't allowed to use the same super cheap subscriptions if your company is big enough.
ericcholis•1h ago
I believe that Enterprise plans have no bundled usage.

https://support.claude.com/en/articles/9797531-what-is-the-e...

gnabgib•1h ago
That no longer exists. Anthropic business is now seat + usage billed.

https://claude.com/pricing#team-&-enterprise

sunshowers•52m ago
At this point the subsidized rates are only available with individual plans. In principle your workplace can pay for an individual plan for you, but for compliance reasons that is likely only feasible at smaller places that are primarily open source oriented (so there's little risk of proprietary code leaking).
cute_boi•1h ago
First they bragged about using so many tokens; now they cap it once they hit the bottom line, lol.
analogpixel•1h ago
I find it kind of funny that all these companies were tokenmaxxing while the AI companies were providing services at huge discounts, costing them tons of money, just so people could get on leaderboards at work. How much have Anthropic and OpenAI spent on people just wanting to get on the leaderboard at work? (Or worse, how many trees have been burned down just to get on the leaderboard at work?)
rnagulapalle•1h ago
Yup! Everyone is hurrying without checking the value.
anon291•59m ago
Trees being burned down is not a valid argument against AI as we have unlimited energy available should we choose to build it.
maplethorpe•1h ago
Isn't inference cheap? Why are AI labs charging so much for it?
slashdave•57m ago
Define "cheap".

If you mean cheaper than training, sure

rnagulapalle•1h ago
Their COO already said in public that it's hard to measure: https://www.businessinsider.com/uber-coo-andrew-macdonald-ai...
ck2•1h ago
I read somewhere this morning there is now more spending on datacenter infrastructure for "AI" in the US than all other infrastructure combined, roads, bridges, ship ports, etc.

Sounds plausible but I doubt it outmatches ICE warehouse concentration camp spending

Which is now the future of this country unless we force a course correction, by 2029 you'll drive down highways and it will just be one datacenter and ICE prison warehouse after another

I do not understand why you need as many GPUs powered up as there are people in the country, or even a 1:10 ratio. It's all going to sit idle until they find something practical to do with "AI" other than entertainment, because it's not profitable. How are they going to monetize it? They cannot.

cletus•57m ago
If I were the CTO of any of these companies I would be working my butt off to be making an internal version of Claude. Let me explain my reasoning using Google as an example (disclaimer: Xoogler).

Google has a lot of systems to make a very large monorepo manageable so builds and code search don't take forever. The build system is Blaze (on which Bazel is based), which has a Pythonic syntax and was once Python but that hasn't been the case (AFAIK) for over a decade. This means you build a massive digraph of build artifacts. By "large" I mean somewhere between 100M and 1B vertices (guessing). Loading that became a significant problem for a build so there's heavy caching around that. There's also heavy caching around build artifacts (ie Forge).

So, part of the issue with every developer using Claude is that you have a ton of inefficiency because everybody has a significant context. And what is context really? It's not too dissimilar to the build graph and/or code search you already have.

So the infra I would be working on would be some kind of "global context" or "context cache". Now a lot of context changes when you do a local change but a lot doesn't. As an ordinary engineer, you aren't generally modifying /base. You're modifying leaf nodes or branches for very few leaf nodes.

The reasons I see to do this are:

1. Cost-savings by deduplication;

2. Speed if context is partially-cached;

3. You avoid issues of sending your code out to third parties. In the case of Google or Amazon, if they use Claude at all, they would probably only be using their own clouds so they avoid this. But Uber doesn't have that luxury;

4. You avoid any issues of people using your prompts or responses for training and leaking any potentially sensitive information that way;

5. You can use off-peak resources for a lot of this work;

6. You can control resources within your own pervasive resource management (in the case of Google); and

7. You can more easily integrate into internal tooling.
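
One way to picture the "context cache" idea above is a content-addressed store: context for a file is built once and reused by every developer whose copy of that file is byte-identical. The following is only a minimal sketch under stated assumptions. `ContextCache`, `assemble`, and the `build_context` callback are hypothetical names invented for illustration; they are not part of Blaze, Bazel, Claude, or any vendor's prompt caching.

```python
import hashlib


class ContextCache:
    """Shared cache of per-file context, keyed by a hash of path + contents."""

    def __init__(self, build_context):
        # build_context is the (assumed expensive) function that turns a
        # file into model context, e.g. a summary or an index entry.
        self._build_context = build_context
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(path, text):
        digest = hashlib.sha256()
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator so path and contents cannot blur together
        digest.update(text.encode("utf-8"))
        return digest.hexdigest()

    def get(self, path, text):
        """Return cached context, building it only if this exact content is new."""
        key = self._key(path, text)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._build_context(path, text)
        return self._store[key]


def assemble(cache, files):
    """Build the full context for one developer's working tree."""
    return [cache.get(path, text) for path, text in sorted(files.items())]
```

In this sketch, two developers who share an unmodified file under `base/` but edit different leaf files would trigger one build for the shared file and one each for their own leaves, which is the deduplication described in reason 1.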

I also think that expanding compute power is the biggest risk to Anthropic (and OpenAI). There's a vast difference between a model you need a cluster of NVidia's finest to run vs one you can run on a MacBook Pro. We aren't there yet on a MacBook Pro but it'll only be a few years until we are.

minimaxir•51m ago
The costs of a) self-hosting a >100B param LLM, b) scaling it to a full company, and c) maintaining it are all significant, risky investments that are even more expensive in the short term.

Those are generally the core reasons most SaaSes exist. Additionally, (a) is the biggest issue because there is no open-weights model that can match GPT 5.5/Opus 4.8.

baggachipz•56m ago
Wait until the true cost of using these LLMs comes home to roost as these companies scramble to stop losing gobs of money. Current prices are still heavily subsidized.
glimshe•56m ago
We're going to see a 180 degree turnaround and a new metric soon: the less you spend, the better your yearly review. Going above quota will require syncs, forms, manager and VP approval etc.
radiator•44m ago
That used to be the norm. You want to spend company money, you need to justify it.
__natty__•55m ago
Maybe one day companies will optimize AI costs by hiring people?
bijowo1676•43m ago
pretty sure ChatGPT tokens should be cheaper than the CEO pay (Uber's CEO pay is $36,000,000+)

I don't understand why the CEO doesn't optimize and automate himself out of the job, like the software engineers are told to do

defmetrix•54m ago
I don't think anyone is surprised. I'm sure many employees were going wild with all sorts of useless "projects".
neals•51m ago
So... did we just basically produce a lot of heat in a bunch of datacenters? Not a lot of value?
zoogeny•47m ago
This is a contrarian view and I am a biased AI-maximalist. But I actually think these kinds of results are genuinely important.

There is a lot of frustration and even anger over CEOs pushing AI onto employees and some schadenfreude when it goes wrong. But there is some element of "fail fast" happening here.

I am glad wealthy corporations are footing the bill by stretching this technology to its limit. The fact of the matter is, we don't know how effective the best-of-the-best models are at scale.

There is a feeling that once we figure out how to leverage these agents, we'll see explosive growth. It's just going to cost a lot of money figuring it out.

It seems that for now, handing over 100% of code writing to LLMs is going to be too expensive. Cost per token for equivalent code is too high.

josefritzishere•37m ago
I have a feeling it's not going to be magic and will obey the laws of Supply and Demand like all other tech products; further, that it's hugely overvalued and is going to crash like a meteor before it's over. But we'll all find out together, right?
zoogeny•26m ago
Yes, right now all we have are vibes/feelings. My point was that one benefit of the hype and the "CEO psychosis" is that we'll find out together fast. Uber, and companies like it, have the money to take the kind of risk that accelerates learning.

And the first data point is in your favor, kind of. I mean, Uber engineers were sufficiently incentivized to use the tokens they were given. It isn't easy to determine what the exact motivation was. What might result from this latest round of CEO backtracking is either relief (don't have to pretend to use AI anymore) or frustration (upset at a useful tool being taken away).

There are two possible stories here. One, they forced everyone to use AI and didn't get enough benefit to justify the cost. Two, they gave the opportunity to their employees to use unlimited AI and those employees jumped at the chance with a vigor that management didn't expect.

All we really know is that value per token must have been low enough to cause this change.

_fat_santa•22m ago
At my company we're using Claude Code w/ API Billing and I found that unless you're running ralph loops on Opus with extended thinking, it's very hard to blow through more than $200/mo.

I made this argument earlier and I'll make it again: I think a major contributing factor to AI budgets exploding is the token leaderboards, the culture of "tokenmaxxing", and the constant narrative that if you're not burning X tokens a month, you're not a good engineer.

GiorgioG•11m ago
Nobody saw this coming...nobody /s
prymitive•42m ago
I have no idea how much I’ve spent; it’s invisible to me, and the company doesn’t share it with me. I have no idea what “1 credit” means in terms of dollars: is that $1? $0.10? $0.01? Is it even a fixed price? I have no idea how much a given task will cost. Well, I can ask for a plan and extrapolate from that, but all perfectly reasonable-looking plans eventually end up in a rabbit hole. Providers keep introducing new models, and each is more expensive while offering modest improvements; it’s a silent inflation.

So I personally can easily believe that. Especially since a lot of people will just try to see whether the model can make that huge improvement or refactoring they’ve been hoping for a reality, or run tons of experiments to validate ideas.

lijok•51m ago
Are you describing finetuning?

MAI-Code-1-Flash

https://microsoft.ai/news/introducingmai-code-1-flash/
280•EvanZhouDev•3h ago•131 comments

CT scans of BYD car parts

https://www.lumafield.com/scan-of-the-month/byd
87•viasfo•1h ago•20 comments

Gmail thinks I'm stupid, so I left

https://moddedbear.com/gmail-thinks-im-stupid-so-i-left
402•speckx•2h ago•240 comments

MAI-Thinking-1

https://microsoft.ai/news/introducing-mai-thinking-1/
144•LER0ever•3h ago•60 comments

Open Repair Data Standard – Open Repair Alliance

https://openrepair.org/open-data/open-standard/
57•cassepipe•2h ago•1 comments

My thoughts after using Clojure for about a month

https://www.acdw.net/clojure/
42•speckx•2h ago•2 comments

HP re-releases classic computer science calculator: The HP-16C

https://hpcalcs.com/product/hp-16c-collectors-edition/
69•dm319•3h ago•39 comments

A walking tour of surveillance infrastructure in Seattle (2020)

https://coveillance.org/a-walking-tour-of-surveillance-infrastructure-in-seattle/
349•eustoria•8h ago•211 comments

Show HN: Live breath detection and biofeedback from a phone microphone

https://github.com/shiihaa-app/shiihaa-breath-detection
15•felixzeller•6h ago•5 comments

Adafruit receives demand letter from Fenwick legal counsel on behalf of Flux.ai

https://blog.adafruit.com/
564•semanser•12h ago•234 comments

The advertising cartel coming to your web browser

https://blog.zgp.org/the-advertising-cartel-coming-to-your-web-browser/
85•speckx•2h ago•26 comments

How we index images for RAG

https://www.kapa.ai/blog/how-we-index-images-for-rag
51•mooreds•5h ago•7 comments

Trump signs downsized AI order after weeks of reversals

https://www.politico.com/news/2026/06/02/trump-signs-downsized-ai-order-00946389
138•_alternator_•5h ago•92 comments

Launch HN: Rudus (YC P26) – AI for concrete contractors

29•rishipankhaniya•3h ago•13 comments

Bringing Up DeepSeek-V4-Flash on AMD MI300X

https://fergusfinn.com/blog/deepseek-v4-flash-mi300x/
63•kkm•4h ago•6 comments

QBE – Compiler Backend – 1.3

https://c9x.me/compile/release/qbe-1.3.html
60•birdculture•4h ago•11 comments

Multicore suppport for DOS is real – partly

https://www.vogons.org/viewtopic.php?t=111336
33•beebix•2d ago•7 comments

Review of the MoErgo Glove80 Keyboard

https://arslan.io/2024/04/22/review-of-the-moergo-glove80-keyboard/
10•akyuu•1d ago•2 comments

GitHub Copilot App

https://github.com/features/preview/github-app
87•theanonymousone•4h ago•60 comments

Why Janet? (2023)

https://ianthehenry.com/posts/why-janet/
413•yacin•12h ago•220 comments

Expanding Project Glasswing

https://www.anthropic.com/news/expanding-project-glasswing
143•surprisetalk•8h ago•188 comments

Fidonet: Technology, Use, Tools, and History (1993)

https://www.fidonet.org/inet92_Randy_Bush.txt
136•BruceEel•8h ago•49 comments

Preparing for KDE Plasma's Last X11-Supported Release

https://blog.davidedmundson.co.uk/blog/596/
121•jandeboevrie•7h ago•149 comments

Love systemd timers

https://blog.tjll.net/you-dont-love-systemd-timers-enough/
312•yacin•12h ago•205 comments

Age verification for social media, the beginning of the end for a free internet?

https://mullvad.net/en/blog/age-verification-for-social-media-the-beginning-of-the-end-for-a-free...
418•StrLght•22h ago•315 comments

Great Question (YC W21) Is Hiring Applied AI Interns

https://www.ycombinator.com/companies/great-question/jobs/J5TNvQH-ai-engineer-intern
1•nedwin•10h ago

Show HN: RePlaya – self-hosted browser session replay with live tailing

https://github.com/s2-streamstore/replaya
33•shikhar•4h ago•5 comments

Microsoft announces Scout, an autonomous AI agent built on OpenClaw

https://www.computerworld.com/article/4180103/microsoft-unveils-scout-an-autonomous-ai-agent-buil...
68•EvanZhouDev•3h ago•62 comments

BQN: What Is a Primitive?

https://mlochbaum.github.io/BQN/commentary/primitive.html
30•tosh•3d ago•2 comments

Made a Tool to Streams Changes from Microsoft SQL Server to Apache Kafka

https://github.com/Niyko/Athena
11•hyvr_official•2d ago•2 comments