Uber's $1,500/Month AI Limit Is a Useful Signal for AI Tool Pricing

https://simonwillison.net/2026/Jun/3/uber-caps-usage/

35•pdyc•1h ago

Comments

ChrisArchitect•50m ago

Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing

https://news.ycombinator.com/item?id=48268871

Uber torches 2026 AI budget on Claude Code in four months

https://news.ycombinator.com/item?id=47976415

Corporate America Is Starting to Ration AI as Cost Skyrockets

https://news.ycombinator.com/item?id=48335388

PessimalDecimal•32m ago

These are still at currently subsidized prices. We'll see if they think they're getting $1500/month of value when that buys significantly fewer tokens.

boringg•29m ago

True, but they will raise prices slowly, so people will optimize their workflows instead of throwing as much inference at things as fast as possible, as they do now. Right now you should try everything you wanted to try, because it is cheap (as long as you don't become dependent; that's the risk).

pdyc•29m ago

AFAIK, enterprise plans are not subsidized: it's $20/seat plus API pricing. Unless you are saying API pricing itself is subsidized.

LurkandComment•27m ago

This is market introductory pricing that hasn't factored in cost recovery. Most of it has been run on early investment, with the assumption they will recover costs in the long run. The prices are subsidized across the board and they will need to go up significantly to recover them.

logancbrown•24m ago

None of what you said is true

rimliu•19m ago

And you know this how?

swiftcoder•23m ago

Assuming this were accurate, then presumably the AI companies would be betting that inference costs come down before the bill is due - I don't see enterprises being willing to absorb another ~10x price increase for tokens (as they've just done going from subscription prices to per-token pricing)

square_usual•28m ago

There is no evidence that per-token inference prices (which is what Uber is setting a cap on) are subsidized.

lelanthran•24m ago

Is there any evidence that it's not?

thejazzman•21m ago

Yes; they ban various uses of their subscriptions but say you can do whatever if you’re paying for the API without limits

lelanthran•17m ago

That's not evidence. It's very likely, though; the only evidence we'll get one way or another is when they IPO.

simonw•9m ago

This story isn't about those subscriptions - enterprise customers like Uber are paying the full API prices.

Topfi•13m ago

The fact that Anthropic models are offered at the same API pricing not just by Anthropic but also on AWS, Azure and Vertex, despite Anthropic taking a major slice for licensing, along with what an open-weight 1T-parameter model like K2.6 costs to run on any third-party provider, makes it unlikely that API inference costs are subsidized by the labs.

sourcecodeplz•18m ago

As I understand it, the current $20 Codex sub is worth about $480 in GPT5 API credits.

MagicMoonlight•8m ago

It's not. They recently forced enterprise customers onto API billing instead of the cheap consumer pricing. Now the pricing is brutal.

CharlieDigital•29m ago

$1500/mo is $18,000/seat/annum.

Maybe Microsoft and Nvidia are on to something.

128 GB machines that can run local LLMs are a bargain even if priced $5-8k. Yes, tok/s is not quite there, but that's probably OK since the bottleneck really isn't the code; it's WTF did Uber build with all of that spend? How did it meaningfully impact their revenue in a positive direction?
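For anyone checking the arithmetic in this comment, here is a minimal sketch. It uses only the figures quoted above ($1,500/month, $5-8k hardware); the function names are made up for illustration, and it assumes a local machine would replace the capped spend entirely, ignoring power, maintenance, and any quality gap.

```python
# Figures quoted in the thread: a $1,500/month cap and a $5-8k local machine.
MONTHLY_CAP_USD = 1_500


def annual_cost(monthly_usd: float) -> float:
    """Annualize a monthly per-seat spend."""
    return monthly_usd * 12


def payback_months(hardware_usd: float, monthly_usd: float = MONTHLY_CAP_USD) -> float:
    """Months of capped API spend that add up to a one-off hardware price.

    Assumption: ignores electricity, maintenance, and model quality differences.
    """
    return hardware_usd / monthly_usd


print(annual_cost(MONTHLY_CAP_USD))  # 18000
for price in (5_000, 8_000):
    print(price, round(payback_months(price), 1))
```

Under those assumptions the hardware price equals a few months of capped spend, which is the point the comment is making; whether the local model is good enough is a separate question.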

dkdcdev•24m ago

at their scale they could also just run a large on-premise or rented (basically still cloud, but cheaper) GPU cluster and run through that. fixed costs, even license a SOTA model’s weights if you’d like

embedding-shape•21m ago

> even license a SOTA model’s weights if you’d like

Yeah, I bet all labs releasing SOTA models are more than happy to remove the main way they make money and let you run it locally, especially if you're a big spender like Uber who seems very willing to throw money into the sea as an experiment.

throwway120385•18m ago

That's going to stop eventually, and I think at that point we're going to see business models more like the major CAD providers.

idiotsecant•11m ago

I don't think they'll have a choice, open weights models are not far behind. At some point it's essentially a commodity game

dkdcdev

LurkandComment•29m ago

1) This happened because they fundamentally misunderstand how to use AI and how AI is priced. 2) Most organizations are throwing everything in for analyses and not limiting the answer they want. You need to be specific about what you analyze and what answers you want. 3) People undervalue prompting or templated responses. I will have written, validated and sanity-checked a prompt several times and run it across several models before I say it's ready for use. But when it is, I know what it will give me, and that the scope of its research and answer is as close to what I want as it can be, with as little excess as possible. This all saves tokens.

jkwang•27m ago

The $1500 number is less interesting than the fact that they hit a ceiling at all. Most engineering teams I've talked to have no idea what their AI spend is per developer because it's buried in a consolidated cloud bill. Having a hard cap forces two useful conversations: what workflows actually justify API calls vs local inference, and whether the output is being measured against any real productivity metric. Without that feedback loop it's just a race to see who can burn tokens fastest.

simonw•11m ago

Both the Anthropic and OpenAI "Enterprise" plans include per-developer analytics:

Anthropic: https://support.claude.com/en/articles/12883420-view-usage-a...

OpenAI: https://help.openai.com/en/articles/10875114-workspace-analy...

jedisct1•27m ago

A lot of things can be done with local models.

rimliu•19m ago

Even more things can be done without any models just as well.

dude250711•9m ago

Single developers seeking local models.

cloudking•26m ago

They are also beholden to enterprise pricing and can't use the subsidized consumer max plans.

ilia-a•25m ago

Seems like an odd limit, especially since it is highly dependent on the token provider used: with Opus this is not much and could easily be burnt in a week or less, but with something like DeepSeek the $1,500 could literally be an annual budget.

That being said, I do have to wonder why someone as big as, say, Uber doesn't simply roll out an OSS model in the cloud for their team. I'd imagine that would be the cheapest and most flexible option, while also keeping all the data shared with the LLM private.
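To make the "depends on the provider" point concrete, here is a rough sketch of how far a fixed dollar cap stretches at different per-token prices. The two prices are placeholders chosen only to show the spread, not real price lists for any provider, and the model names are hypothetical.

```python
def tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> float:
    """Number of tokens a budget buys at a flat price per million tokens."""
    return budget_usd / usd_per_million_tokens * 1_000_000


# Placeholder prices in USD per million tokens (assumptions, not quotes).
HYPOTHETICAL_PRICES = {
    "premium-model": 50.0,
    "budget-model": 1.0,
}

for name, price in HYPOTHETICAL_PRICES.items():
    print(name, f"{tokens_for_budget(1_500, price):,.0f} tokens")
```

With a 50x price gap the same cap buys 50x the tokens, so a single dollar figure says little without knowing which model sits behind it.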

iceman28•23m ago

It’s not just about the model but also setting up the system to create and share compute (GPUs), which is quite complicated on its own. Uber's primary business focus isn’t infrastructure.

epsteingpt•19m ago

Uber engineers reported that loading their workspace and pulling recent commits exhausted that AI limit for Claude Code (4.8 x-high) immediately.

ashahin•16m ago

The $1,500 frames this as a per-engineer ceiling, but the unit of consumption shifted under everyone's feet — engineers don't issue prompts anymore, they kick off agent loops that fan out into 20–100 tool calls and 10–50 LLM calls per task. A single agent run on a non-trivial refactor burns more tokens than the engineer typing for an hour. So the cap doesn't constrain engineers, it constrains agent-task throughput per engineer — which is a different thing. The leaderboard-vs-cap debate misses that the metric worth bounding is $/successful-PR or $/correct-completion, not $/engineer-month; variance between cheap and expensive tasks at the same budget is 10–50x now rather than 2–3x. Per-tool caps eventually force every team to ask: which workflows justify burning through tokens, and which should be cached, retrieved, or templated.
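A rough sketch of the $/successful-PR metric described above. The `AgentRun` record, its fields, and the sample numbers are hypothetical; it assumes failed runs still count toward spend, since their tokens are billed either way.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentRun:
    cost_usd: float  # total inference spend for one agent run
    merged: bool     # True if the run ended in a merged PR


def cost_per_successful_pr(runs: List[AgentRun]) -> Optional[float]:
    """Total spend across all runs divided by the number of merged PRs.

    Returns None when nothing merged, since the ratio is undefined.
    """
    successes = sum(1 for run in runs if run.merged)
    if successes == 0:
        return None
    return sum(run.cost_usd for run in runs) / successes


# Made-up sample: two cheap successes and one expensive failure.
runs = [AgentRun(2.0, True), AgentRun(40.0, False), AgentRun(8.0, True)]
print(cost_per_successful_pr(runs))  # 25.0
```

In this toy sample the one failed run dominates the figure, which is the kind of variance a flat per-engineer cap hides.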

onlyrealcuzzo•13m ago

It's interesting to me how ineffective LLMs are at refactoring, but when you think closely about how they work, it makes sense.

They are good at searching for things that have been done 10,000 times before, and slightly changing them. This is the majority of all "new" features.

Almost nothing is "new"...

Refactors are not this. If you can't just write a gsub to do the work, they need to essentially break it up into N problems to solve, each of them pretty slow and expensive. Sure, none of these problems individually are "new" - which is why they can do it. But they can't do it as effectively as you'd think.

hanzeweiasa•10m ago

Good point about the unit of consumption shifting from prompts to agent loops. That makes pricing even trickier for vertical-specific AI tools.

We see this firsthand building AI Workdeck (open-source AI workspace for legal teams). A single due diligence review might chain 20+ agent calls: OCR -> text extraction -> clause classification -> risk scoring -> evidence chain assembly. The user sees one action, but the backend burns through significant inference.

The interesting thing about vertical tools is the pricing model can be fundamentally different. Horizontal tools charge per seat or per token. But in legal, the value is in the document, not the seat. A lawyer reviewing a 500-page M&A file gets way more value than one reviewing a 2-page NDA.

Self-hosting changes the calculus too. Our users run on their own infra, so the AI cost is whatever their GPU costs. That makes $1,500/month caps less relevant and throughput optimization more important.

f311a•15m ago

How many more months do we need to wait, until big companies realize that flash models work just fine if you:

1) Don't ask LLMs for big changes

2) Review everything and point them in the right direction

Large models still suck at big changes, they produce questionable architecture and you still have to review the code, if your project is serious enough.

The codebase quickly becomes a mess if you don't pay enough attention. It does not matter which model.

So why bother with big models, when flash models are 10x cheaper and much faster to iterate under guidance? Large models can be used for security and bug audits. Flash models work almost the same for changes under 300 LOC when you dictate how you want your code to look.

jwpapi•15m ago

If you estimate a $10k monthly salary per engineer, $1,500 is 15% of salary: the point at which it becomes cheaper for them to hire another engineer. That doesn't mean it's improving productivity by 15%, but if 15% is the point where it stops being better than another human, can we assume 7.5%?

Probably even less, because you would also spend that extra $1,500 per employee if it saved just 10%, so $150 per employee, which is 1.5% of salary.

This is IMHO one of the best ranges we can assume for now. How much would that be across the whole SWE market?
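The percentages in this comment can be reproduced as below. This is one reading of the arithmetic, assuming the "10k" is a monthly salary in the same currency as the cap; the function name is made up.

```python
MONTHLY_SALARY_USD = 10_000  # the commenter's estimate, read as monthly
MONTHLY_CAP_USD = 1_500      # the cap from the article


def percent_of_salary(amount_usd: float, salary_usd: float = MONTHLY_SALARY_USD) -> float:
    """Express a monthly dollar amount as a percentage of monthly salary."""
    return 100 * amount_usd / salary_usd


cap_share = percent_of_salary(MONTHLY_CAP_USD)  # the 15% figure
midpoint = cap_share / 2                        # the 7.5% guess
ten_percent_of_cap = MONTHLY_CAP_USD // 10      # the "$150 per employee"
low_end = percent_of_salary(ten_percent_of_cap) # the 1.5% figure

print(cap_share, midpoint, low_end)  # 15.0 7.5 1.5
```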

AI companies have more expenses than inference.

they also already do this…

Anthropic and OpenAI license to the public clouds. Google reportedly licenses to Apple. licensing to Fortune 100 companies running on their own infra is an obvious next step

it is a race to the bottom and I’m not sure the labs win that race. we’ll see!

jcgrillo•23m ago

> WTF did Uber build with all of that spend?

WTF did anyone build with all that spend? Despite all the feel-good anecdotes about how productive folks feel using ai coding tools there's a deafening silence when it comes to actual, demonstrated efficacy. How can we be this far entrenched in these workflows and still not know whether they actually do anything useful?

awesan•19m ago

I can say at least for me at a small-ish company (~40 FTE) there has been a surge in internal productivity tools. Nothing to improve the end user product directly but a lot of tools to make processes easier and less error prone.

What would previously be janky internal dashboards or excel sheets are now actually nice to use tools. That said of course the maintenance cost of all that has yet to be discovered, and the ROI is questionable.

CharlieDigital•15m ago

About the same ~40 FTE team. We're doing the same thing. Smattering of internal tools, but no net gain in external revenue. Who knows which of those tools will have any value or ppl are just doing it because it's cool now to make fancy dashboards.

OK. I guess that's good, too.

jcgrillo•10m ago

Yeah this seems to be a pretty widespread story, from what I've heard as well. The thing about those janky dashboards and spreadsheets though is that somebody understood them and built them with intent to solve a particular problem. Despite the rickety appearance, they're trustworthy tools. A polished single page app might look nicer but it's harder to debug than an excel sheet, and much less transparent in its internal workings--especially if nobody actually wrote it...

RugnirViking•14m ago

IMO it's pretty clear that anyone who is taking the issue at least somewhat seriously knows the amount of value they provide is non-zero. However, the problems are manifold: firstly, toolchains vary wildly, from fancy autocomplete, to engineers chatting with codebases they're unfamiliar with, to people integrating them into devops and infra, to people doing spec-driven development, with a thousand philosophies in between. Many people suspect that those above them in the ladder are on the cusp of massive failure due to losing track of the code, and many people higher on the ladder think those below them are overly cautious. I hate to be the guy saying "oh it must be somewhere in the middle", but I will say at the very least I like being able to use it to read docs for me, and to synthesize syntax and simple scripts (give me a join that works across these tables and gives me column x, y and z - give me a python script that parses a file like this example and extracts abc data - given this api spec figure out how I can get this data from this endpoint, go)

As for complex software, the art of that is not simply chaining together such scripts. It's the art of using architecture and testing to shape uncertainty, and developing requirements (and extrapolating sensibly from incomplete requirements). I don't think LLMs are great at this, but they aren't terrible either. A lot of the more active users in the space are doing stuff where they've realised they need more detailed specs, which like, yeah, we knew this already - better defined problems lead to better software.

nonethewiser•10m ago

The real answer?

Software engineer quality of life.

There can be an increase in productivity without a corresponding increase in total output. The gains could be captured by software engineers doing a day's work in an hour then fucking off in a variety of ways.

darkwater•20m ago

> it's WTF did Uber build with all of that spend?

You can ask the same about the median 330k salary in the US for Uber Engineering... and being a bit snarky, from attending Uber engineers' talks here and there at a few conferences, it looks like they love to (re)invent internal tooling/platforms. That's pretty expensive on its own.

EDIT: I'm not saying that Uber's engineers didn't add value to the company, they absolutely did and handling the scale up they had to handle is not an easy feat. But I do challenge the notion of "what features did they create with that (LLM) spending?" of GP.

CharlieDigital•12m ago

This is what all "platform engineers" have to do once things are working nicely: you have to keep inventing work.

I don't know; I'm a Ron Popeil "set it and forget it" kind of guy. Make the dumbest, simplest thing that's going to work with some clear path for scaling. Then go do valuable things instead.

darkwater•4m ago

But most Platform Engineering teams in smaller companies (and especially non-US) add a layer on top of existing technologies. A layer that usually maps to the specific culture and idiosyncrasies of that company; a bit like the deployment flow which is usually very specifically shaped on how a company is.

But in Uber's case, they tend to reinvent lower level pieces of platform/infra.

throwaw12•10m ago

you don't get promotion for supporting existing things, but for "inventing" you can get promoted. also for large migrations

sourcecodeplz•20m ago

$1.5kpm for SOTA. 128gb you run DSV4 Flash.

jvanderbot•19m ago

Right - the future of LLMs is like ol' windows XP+Dell. Commercialized "things" you run locally offline, co-designed with hardware, with a known productivity suite, and large businesses building the next generation thing and suite with 18mo release cycles (ish).

nonethewiser•12m ago

XP? I can see the argument for enterprise support, but in that case the latest Windows OS is going to be virtually free, and I don't know if MS and Dell etc. would even support an XP machine. Might even be required for hardware. If no enterprise support, wouldn't Linux make a lot more sense?

I get that if it's offline the security downside of XP doesn't matter, and I assume XP is free, but being free doesn't really seem that valuable compared to alternatives (free Linux and a virtually free OS if buying wholesale).

jvanderbot•4m ago

"Windows XP+Dell" should have been in quotes. It's similar to the way enterprise productivity software was developed, packaged, co-designed with hardware, and sold on an 18mo upgrade cycle assumption. It's not literally Windows XP.

ungreased0675•15m ago

Your last question is really important. What did they accomplish with all that spend?

I suspect there’s some mass delusion with respect to actual accomplishments as a result of LLM use. Sure, things are moving faster, but does it matter?

devttyeu•14m ago

If you believe a 128gb machine that is essentially DGX Spark in a laptop chassis can run models comparable to SOTA you either never ran open models on hard tasks, or you aren't scratching the surface of SOTA closed LLM capability in how you're using them.

f311a•2m ago

Can you show me an example of a hard task that can't be achieved using light models? When we don't want the model to work on autopilot without reviewing the code at all. Even SOTA models will produce garbage code, if you don't guide them all the time.

Hard tasks require a lot of guidance and code reviewing, unless you are creating another throw away project where correctness does not matter.

m3kw9•14m ago

You can't get an edge using local models, these guys may have competitors that will spend on SOTA models. They won't likely ever consider local machines even for some offloading scenarios, the complexity and costs will be even higher.

CharlieDigital•8m ago

Consider rewiring your perspective: getting an edge doesn't really matter; the only thing that matters is will customers pay for this? Is this a useful, valuable problem to solve?

Coding faster doesn't really solve that.

Uber makes more money if people buy more rides, order more food, have some breakthrough in autonomous driving. They can save money if they can optimize some ops or spend somewhere. Is there any evidence that with the spend on AI that they achieved any of this? If they did, I'm sure we'd hear about it in some engineering blog.
