Ask HN: Company is rapidly cutting AI tool spend how to prep team?

3•Snakes3727•45m ago

The company I work for is now rapidly planning to scale down its AI tooling spend. Claude Code access is basically getting removed and people are forbidden from using personal plans.

The reasoning is cost: apparently our monthly Claude bill has become astronomical for the org, nearly 3x our SaaS's cloud spend.

Apparently we are going to get limited access to Codex on severely reduced plans.

I have tried some local models such as Kimi; however, most are barely functional.

I am very concerned, as the expected amount of work is to remain the same. Even ignoring the fact that teams have built entire workflows around Claude, I am very worried and frustrated.

How can I help my team ease this transition? Are there local models that run well on local machines that only have 16GB of RAM?

Comments

itg•40m ago

The 16GB of RAM will really limit you. What about trying OpenRouter and using the cheaper models such as Kimi instead of running them locally?
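To put rough numbers on the 16GB limit, here is a back-of-the-envelope sketch. The only firm part is the arithmetic for weight size (parameter count times bits per weight, divided by 8). The `reserved_gib` and `overhead` values are illustrative assumptions, not measurements, and the function names are made up for this example.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float, ram_gib: float,
         reserved_gib: float = 6.0, overhead: float = 1.2) -> bool:
    """Rough check: do the weights fit in what is left of RAM?

    reserved_gib: ASSUMED memory kept for the OS, IDE, browser, etc.
    overhead:     ASSUMED multiplier for context cache and runtime buffers.
    """
    needed = weights_gib(params_billion, bits_per_weight) * overhead
    return needed <= ram_gib - reserved_gib


if __name__ == "__main__":
    # Hypothetical model sizes at 4-bit quantization on 8GB and 16GB machines.
    for ram in (8, 16):
        for size in (3, 7, 14, 32):
            verdict = "fits" if fits(size, 4, ram) else "does not fit"
            print(f"{size}B @ 4-bit on {ram}GB: "
                  f"{weights_gib(size, 4):.1f} GiB weights, {verdict}")
```

Under these assumptions, 4-bit models up to roughly the 14B range are the ceiling on a 16GB laptop, and an 8GB machine has very little room left once everything else is running.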

Snakes3727•32m ago

Given our field, we cannot really use anything not approved by management. Pretty much, if it doesn't leave our machine we can use it; it's just that I haven't found anything good. We even have some new devs on the MacBook Neos, and I can't find anything for them.

I was considering having something run locally within our building, but that would not be available in the near term, so I am trying to make the best of what I can do.

baigy•40m ago

Specifically: to explore your open-source options with compute limitations, ask the community at r/LocalLLaMA on Reddit. That's where the current SOTA open-source text-to-text models live.

Snakes3727•28m ago

Yeah, I was looking there earlier. Thankfully we mostly have MacBooks, but I recently found out new devs are getting the smaller 8GB RAM MacBooks as well, which is going to be even more frustrating.

Since my team is mostly remote, running an LLM on a cluster in the office is not really viable short term.

baigy•21m ago

This is totally going to suck, but here's one option I was just suggested a few mins ago: https://www.reddit.com/r/LocalLLaMA/comments/1th1mqx/comment... For context, I was asking about running anything OpenClaw-friendly on my RTX4060 8GB VRAM. I know yours is a more involved use-case, but there's still some optionality here.

xvxvx•31m ago

My own company hired a young goon of a man to spearhead their AI initiative. Lots of smiles and arrogance from him. Fast forward 2 months and reality has hit. Weekly meetings asking for feedback draw blank stares as employees explain that Claude can’t do shit to help their workload. This kid is starting to sweat. I bet he’ll be gone by the summer. Hilarious.

Snakes3727•26m ago

Unfortunately, at my company leads have no insight into employees' Claude Code caps, and no one ever complained until now. Apparently some people were basically running with insane caps on CC (25k+); if you asked for it, you were approved. This led to some people doing insane things on CC for no purpose.

baigy•18m ago

Just setting up better SOPs around using AI for coding is going to help them a ton. They can chalk off the sunk cost to a "learning phase", with now being the time to use the lesson learnt to formulate some future-looking standard operating procedures. No need to suddenly go cold-turkey on AI. My 2 cents.
