Don't rent the cloud, own instead

https://blog.comma.ai/datacenter/
91•Torq_boi•2h ago

Comments

sys42590•59m ago
It would be interesting to hear their contingency plan for any kind of disaster (most commonly a fire) that hits their data center.
sschueller•54m ago
Yep, does anyone remember the OVH fire[1][2]?

[1] https://www.techradar.com/news/remember-the-ovhcloud-data-ce...

[2] https://blocksandfiles.com/wp-content/uploads/2023/03/ovhclo...

instagib•51m ago
Or flooding from a burst frozen pipe, a false sprinkler trigger, or many other causes.

Something very similar happened at work. Water valve monitoring wasn’t up yet. Fire didn’t respond because reasons. A huge amount of water flooded in over a 3-day weekend. Total loss.

twelvechairs•46m ago
There's only one solution to this problem, and it's two data centres in some way or form.
mbreese•23m ago
What's the line from Contact?

"Why build one when you can have two at twice the price?"

But if you're building a datacenter for $5M, spending $10-15M on redundant datacenters (even with extra networking costs) would still be cheaper than their estimated $25M cloud costs.

golem14•3m ago
Or build two 2.5MM DCs (if you can parallelize your workload well enough), and in case of disaster you only lose capacity.

You do, however, need to plan for 1MM+ p.a. in OPEX, because good SREs ain’t cheap (nor are the hardware guys building and maintaining machines).
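To put rough numbers on this sub-thread, here is a minimal sketch in Python. The capex and cloud figures are the ones quoted above ($5M, $10-15M, 2 x $2.5M, $25M), and the 1MM per year OPEX is the lower bound mentioned above. The 5-year horizon and the function name are assumptions made for illustration, since the thread doesn't say what period the $25M covers.

```python
# All amounts in millions of USD.
CLOUD_ESTIMATE = 25.0   # estimated cloud cost quoted in the thread
OPEX_PER_YEAR = 1.0     # "1MM+ pa" staffing cost from the thread (lower bound)
YEARS = 5               # ASSUMPTION: the horizon is not stated in the thread


def total_cost(capex, opex_per_year=OPEX_PER_YEAR, years=YEARS):
    """Capex plus a flat yearly opex over the chosen horizon."""
    return capex + opex_per_year * years


# Capex figures taken from the comments above.
scenarios = {
    "single DC": 5.0,
    "two half-size DCs": 2 * 2.5,
    "redundant DCs (low)": 10.0,
    "redundant DCs (high)": 15.0,
}

for name, capex in scenarios.items():
    cost = total_cost(capex)
    print(f"{name}: {cost:.1f}M total, {CLOUD_ESTIMATE - cost:.1f}M under the cloud estimate")
```

Under those assumptions even the high redundant case stays below the cloud estimate; a longer horizon or a higher staffing cost narrows the gap.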

fpoling•21m ago
They use the datacenter for model training, not to serve online users. Presumably, even if it is offline for a week or even a month, it will not be a total disaster as long as they have, for example, offsite tape backups.
langarus•51m ago
This is a great solution for a very specific type of team but I think most companies with consistent GPU workloads will still just rent dedicated servers and call it a day.
hyperbovine•36m ago
I agree, and cloud compute is poised to become even more commoditized in the coming years (gazillion new data centers + AI plateauing + efficiency gains, the writing is on the wall). There’s no way this makes sense for most companies.
NitpickLawyer•26m ago
> AI plateauing

Ummm is that plateauing with us in the room?

The advantage of renting vs. owning is that you can always get the latest gen, and that brings you newer capabilities (fp8, fp4, etc.) and cheaper prices for current_gen-1. But betting on something plateauing when all the signs point towards the exact opposite is not one of the bets I'd make.

ocdtrekkie•18m ago
It's the opposite. The more consistent your workload the more practical and cost-effective it is to go on-prem.

Cloud excels for bursty or unpredictable workloads where quickly scaling up and down can save you money.

cgsmith•48m ago
I used to colocate a 2U server that I purchased with a local data center. It was a great learning experience for me. I'm curious why a company wouldn't colocate its own hardware. Proximity isn't an issue when you can have the datacenter perform physical tasks. Bravo to the comma team regardless. It'll be a great learning experience and make each person on their team better.

PS: BX cable instead of conduit for electrical looks cringe.

comrade1234•45m ago
15 years ago or so, a spreadsheet was floating around where you could enter server costs, compute power, etc., and it would tell you when you would break even by buying instead of going with AWS. I think it was leaked from Amazon, because it always came out at three years to break even, even as hardware changed over time.
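
The kind of calculation such a spreadsheet would do can be sketched in a few lines of Python. This is not the spreadsheet itself: the function name, the inputs, and the sample numbers are all hypothetical, chosen so that the result lands on the three-year figure mentioned above.

```python
def break_even_months(hardware_capex, cloud_monthly, onprem_monthly):
    """Months until buying hardware becomes cheaper than renting.

    Returns None when on-prem running costs are not lower than the
    cloud bill, because the purchase then never pays for itself.
    """
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        return None
    return hardware_capex / monthly_saving


# Hypothetical inputs, in USD.
months = break_even_months(
    hardware_capex=360_000,
    cloud_monthly=15_000,
    onprem_monthly=5_000,
)
print(months)  # 36.0
```

The on-prem monthly figure is where power, colocation, and staffing would go, which is why it tends to dominate the answer.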
Onavo•39m ago
Well, somebody should recreate it. I smell a potential startup idea somewhere. There's a ton of "cloud cost optimizer" software, but most of it involves tweaking AWS knobs and taking a cut of the savings. A startup that could offload non-critical services from AWS to colo and traditional bare metal hosting like Hetzner has a strong future.

One thing to keep in mind is that the curve for GPU depreciation (in the last 5 years at least) is a little steeper than 3 years. Current estimates are that the capital depreciation cost would plunge dramatically around the third year. For a top-tier H100, depreciation kicks in around the third year, but they mention that for less capable cards like the A100 the depreciation is even worse.

https://www.silicondata.com/use-cases/h100-gpu-depreciation/

Now, this is not factoring in the cost of labor. Labor at SF wages is dreadfully expensive. If your data center is right across the border in Tijuana, on the other hand...

TonyStr•24m ago
Azure provides their own "Total Cost of Ownership" calculator for this purpose [0]. Notably, this makes you estimate peripheral costs such as cost of having a server administrator, electricity, etc.

[0] - https://azure-int.microsoft.com/en-us/pricing/tco/calculator...

hbogert•44m ago
Datacenters need cool dry air? <45%

No, low isn't good per se. I worked in a datacenter that had less than 40% humidity in winter, and RAM was failing all over the place. Low humidity causes static electricity.

mbreese•29m ago
Low is good if you are also adding more humidity back in. If you want to maintain 45-50% (guessing), then you would want <45% environmental humidity so that you can raise it to the level you want. You're right about avoiding static, but you'd still want to try to keep it somewhat consistent.

It is much cheaper to use external air for cooling if you can.

Semaphor•43m ago
In case anyone from comma.ai reads this: the "CTO @ comma.ai" link at the end is broken; it’s relative instead of absolute.
croisillon•18m ago
no because it's on premise you see? you don't need to access the world wide web, just their server

/s

simianwords•35m ago
The reason companies don’t go on-premises, even if cloud is way more expensive, is the risk involved in running on-premises.

You can see quite clearly here that there are so many steps to take. A good company would concentrate risk on its differentiating factor, or the specific part where it has a competitive advantage.

It’s never about “is the expected cost on-premises less than cloud”; it’s about the risk-adjusted costs.

Once you’ve spread risk not only across your main product but also across your infrastructure, it becomes hard.

I would be wary of a smallish company building their own Jira in-house in a similar way.

d1sxeyes•30m ago
It’s also opex vs capex, which is a battle opex wins most of the time.
simianwords•26m ago
I think it wins because opex is seen as a stable recurring cost, and capex is seen as the money you put into your primary differentiation for long-term gains.
d1sxeyes•23m ago
True, but for a lot of companies “our servers are on-prem” is not a primary differentiator.
TonyStr•22m ago
Capex may also require you to take out loans
danpalmer•22m ago
> Cloud companies generally make onboarding very easy, and offboarding very difficult.

I reckon most on-prem deployments have significantly worse offboarding than the cloud providers. As a cloud provider you can win business by having something for offboarding, but internally you'd never get buy-in to spend on a backup plan if you decide to move to the cloud.

intalentive•13m ago
I like Hotz’s style: simply and straightforwardly attempting the difficult and complex. I always get the impression: “You don’t need to be too fancy or clever. You don’t need permission or credentials. You just need to go out and do the thing. What are you waiting for?”
jillesvangurp•8m ago
At scale (like comma.ai), it's probably cheaper. But until then it's a long-term cost optimization with really high upfront capital expenditure and risk. Which means it doesn't make much sense for the majority of startup companies until they become late stage and their hosting cost actually becomes a big cost burden.

There are in-between solutions. Renting bare metal instead of renting virtual machines can be quite nice. I've done that via Hetzner some years ago. You pay just about the same but you get a lot more performance for the same money. This is great if you actually need that performance.

People obsess about hardware, but there's also the software side to consider. For smaller companies, operations/devops people are usually more expensive than the resources they manage. The cost to optimize is that cost. The hosting cost usually is a rounding error on the staffing cost. And on top of that, the number of responsibilities increases as soon as you own the hardware. You need to service it, monitor it, replace it when it fails, make sure those fans don't get jammed by dust puppies, deal with outages when they happen, etc. All the stuff that you pay cloud providers to do for you now becomes your problem. And it has a nonzero cost.

The right mindset for hosting cost is to think of it in FTEs (full-time employee cost for a year). If it's below 1 (most startups until they are well into scale-up territory), you are doing great. Most of the optimizations you are going to get are going to cost you in actual FTEs spent doing that work. 1 FTE pays for quite a bit of hosting. Think 10K per month in AWS cost. A good ops person/developer is more expensive than that. My company runs at about 1K per month (GCP and misc managed services). It would be the wrong thing to optimize for us. It's not worth spending any amount of time on for me. I literally have more valuable things to do.

This flips when you start getting into multiple FTEs in cost for just the hosting. At that point you probably have additional cost measured in 5-10 FTEs in staffing anyway to babysit all of that. So now you can talk about trading off some hosting FTEs for a modest amount of extra staffing FTEs and make net gains.
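
The FTE framing above is easy to turn into a quick check. A minimal Python sketch: the 1K and 10K monthly figures come from the comment, while the 100K case and the 150K fully loaded yearly cost per FTE are assumptions for illustration (the comment only says a good ops person costs more than 10K per month).

```python
FTE_ANNUAL_COST = 150_000  # ASSUMPTION: fully loaded yearly cost of one ops hire, USD


def hosting_in_ftes(monthly_hosting, fte_annual_cost=FTE_ANNUAL_COST):
    """Express a yearly hosting bill as a multiple of one FTE."""
    return monthly_hosting * 12 / fte_annual_cost


# 1K and 10K per month are from the comment; 100K is a hypothetical larger bill.
for monthly in (1_000, 10_000, 100_000):
    print(f"{monthly} USD/month = {hosting_in_ftes(monthly):.2f} FTE")
```

Below 1.0, the argument above is that optimization work costs more than it saves; the picture changes once hosting alone is worth several FTEs.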
