
The United States Withdraws from UNESCO

https://www.state.gov/releases/office-of-the-spokesperson/2025/07/the-united-states-withdraws-from-the-united-nations-educational-scientific-and-cultural-organization-unesco
105•layer8•38m ago•46 comments

Show HN: The Magic of Code – book about the wonders and weirdness of computation

https://themagicofcode.com/sample/
19•arbesman•2h ago•3 comments

How to Firefox

https://kau.sh/blog/how-to-firefox/
398•Vinnl•3h ago•224 comments

Yt-transcriber – Give a YouTube URL and get a transcription

https://github.com/pmarreck/yt-transcriber
27•Bluestein•56m ago•5 comments

Global hack on Microsoft SharePoint hits U.S., state agencies, researchers say

https://www.washingtonpost.com/technology/2025/07/20/microsoft-sharepoint-hack/
725•spenvo•1d ago•381 comments

Unexpected inconsistency in records

https://codeblog.jonskeet.uk/2025/07/19/unexpected-inconsistency-in-records/
32•OptionOfT•2d ago•11 comments

Uv: Running a script with dependencies

https://docs.astral.sh/uv/guides/scripts/#running-a-script-with-dependencies
433•Bluestein•15h ago•122 comments

An unprecedented window into how diseases take hold years before symptoms appear

https://www.bloomberg.com/news/articles/2025-07-18/what-scientists-learned-scanning-the-bodies-of-100-000-brits
120•helsinkiandrew•4d ago•64 comments

OSS Rebuild: Open Source, Rebuilt to Last

https://security.googleblog.com/2025/07/introducing-oss-rebuild-open-source.html
3•tasn•54m ago•0 comments

Largest piece of Mars on Earth fetches $5.3M at auction

https://apnews.com/article/mars-rock-meteorite-auction-dinosaur-sothebys-01d7ccfc8dc580ad86f8e97a305fc8fa
23•avonmach•3d ago•14 comments

Jujutsu for busy devs

https://maddie.wtf/posts/2025-07-21-jujutsu-for-busy-devs
288•Bogdanp•14h ago•374 comments

The .a file is a relic: Why static archives were a bad idea all along

https://medium.com/@eyal.itkin/the-a-file-is-a-relic-why-static-archives-were-a-bad-idea-all-along-8cd1cf6310c5
45•eyalitki•3d ago•61 comments

What went wrong inside recalled Anker PowerCore 10000 power banks?

https://www.lumafield.com/article/what-went-wrong-inside-these-recalled-power-banks
465•walterbell•20h ago•230 comments

Python audio processing with pedalboard

https://lwn.net/Articles/1027814/
56•sohkamyung•4d ago•8 comments

Complete silence is always hallucinated as "ترجمة نانسي قنقر" ("Translation by Nancy Qunqar") in Arabic

https://github.com/openai/whisper/discussions/2608
455•edent•9h ago•220 comments

Kapa.ai (YC S23) is hiring a software engineer (EU remote)

https://www.ycombinator.com/companies/kapa-ai/jobs/JPE2ofG-software-engineer-full-stack
1•emil_sorensen•7h ago

So you think you've awoken ChatGPT

https://www.lesswrong.com/posts/2pkNCvBtK6G6FKoNn/so-you-think-you-ve-awoken-chatgpt
110•firloop•1h ago•64 comments

TrackWeight: Turn your MacBook's trackpad into a digital weighing scale

https://github.com/KrishKrosh/TrackWeight
583•wtcactus•23h ago•141 comments

NASA's X-59 quiet supersonic aircraft begins taxi tests

https://www.nasa.gov/image-article/nasas-x-59-quiet-supersonic-aircraft-begins-taxi-tests/
110•rbanffy•3d ago•75 comments

Don't bother parsing: Just use images for RAG

https://www.morphik.ai/blog/stop-parsing-docs
302•Adityav369•21h ago•69 comments

The vibe coder's career path is doomed

https://blog.florianherrengt.com/vibe-coder-career-path.html
7•florianherrengt•1h ago•3 comments

AccountingBench: Evaluating LLMs on real long-horizon business tasks

https://accounting.penrose.com/
503•rickcarlino•21h ago•140 comments

How to Migrate from OpenAI to Cerebrium for Cost-Predictable AI Inference

https://ritza.co/articles/migrate-from-openai-to-cerebrium-with-vllm-for-predictable-inference/
40•sixhobbits•6h ago•26 comments

French petition against return of bee-killing pesticide passes 1M

https://phys.org/news/2025-07-french-petition-bee-pesticide-1mn.html
37•geox•2h ago•3 comments

The Great Unracking: Saying goodbye to the servers at our physical datacenter

https://stackoverflow.blog/2025/07/16/the-great-unracking-saying-goodbye-to-the-servers-at-our-physical-datacenter/
33•treve•3d ago•35 comments

Replit's CEO apologizes after its AI agent wiped a company's code base

https://www.businessinsider.com/replit-ceo-apologizes-ai-coding-tool-delete-company-database-2025-7
122•jgalt212•2h ago•126 comments

Show HN: A rudimentary game engine to build four-dimensional VR environments

https://www.brainpaingames.com/Hypershack.html
30•teemur•2d ago•1 comment

AI comes up with bizarre physics experiments, but they work

https://www.quantamagazine.org/ai-comes-up-with-bizarre-physics-experiments-but-they-work-20250721/
235•pseudolus•13h ago•141 comments

Erlang 28 on GRiSP Nano using only 16 MB

https://www.grisp.org/blog/posts/2025-06-11-grisp-nano-codebeam-sto
185•plainOldText•19h ago•24 comments

New records on Wendelstein 7-X

https://www.iter.org/node/20687/new-records-wendelstein-7-x
233•greesil•23h ago•106 comments

How to Migrate from OpenAI to Cerebrium for Cost-Predictable AI Inference

https://ritza.co/articles/migrate-from-openai-to-cerebrium-with-vllm-for-predictable-inference/
40•sixhobbits•6h ago

Comments

amelius•6h ago
How to move from one service that is out of your control to another service that is out of your control.
anonymousDan•6h ago
I don't understand - what do they mean when they say you can run things on your own infrastructure then?
amelius•6h ago
They say "serverless infrastructure", which is something else.
dist-epoch•5h ago
Your own infrastructure in the same sense as your own AWS EC2 machines.
klabb3•3h ago
> your own AWS EC2 machines

Not disagreeing, but this is quite an expression.

Incipient•4h ago
Having the ABILITY to move seamlessly and without significant cost is absolutely critical.

It gives you flexibility if the provider isn't keeping pace with the market and it prevents the provider from jacking prices relative to its competitors.

Vendor lockin is awful. Hypothetically, imagine how stuffed you'd be if your core virtualisation provider jacked prices 500%! You'd be really hurting.

...ohwait.

kristianc•2h ago
You're not really locked in in any meaningful way currently; you just switch the API you're using, rather like what's being demonstrated here.
eloqdata•5h ago
Why? Honestly, there are already tons of Model-as-a-Service (MaaS) platforms out there—big names like AWS Bedrock and Azure AI Foundry, plus a bunch of startups like Groq and fireflies.ai. I’m just not seeing what makes Cerebrium stand out from the crowd.
benterix•4h ago
Well, they are announcing their $8.5m seed round and hope to attract the maximum number of users by giving away $30 in credits.
tomschwiha•5h ago
The "not optimized" self hosted deployment is 3x slower and costs 34x the price using the cheapest GPU / a weak model.

I don't see the point in self hosting unless you deploy a gpu in your own datacenter where you really have control. But that costs usually more for most use cases.

Incipient•4h ago
Is there actually some scale magic that allows the 34x cost saving (over 100x when you include performance), or is it just insane investment allowing these companies to heavily subsidise cost to gain market share?
tomschwiha•4h ago
Calculating without energy costs: the A10 GPU itself costs $3,200. Amortized over 3 years of usage, that is $0.002 per minute. From the blog post, the cost per minute is charged at $0.02, so a premium of 10x. So, with energy included, self-hosting becomes cheaper if you can load the GPU at a minimum of 15-20%. But you need to take care of your own infrastructure.

With larger purchases the GPU prices also drop so that is the scaling logic.
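As a sketch of that arithmetic (the energy figure is an assumed illustrative value, treated here as an always-on cost of ownership):

```python
# Back-of-envelope break-even for owning an A10 vs. renting per minute.
# GPU price and rental rate come from the comment above; the energy cost
# is an assumed illustrative value, charged whether the GPU is busy or not.
GPU_PRICE = 3200.0                      # USD, A10 purchase price
LIFETIME_MIN = 3 * 365 * 24 * 60        # amortize over 3 years, in minutes

owned_cost_per_min = GPU_PRICE / LIFETIME_MIN   # ~$0.002/min
rental_cost_per_min = 0.02                      # serverless rate, per minute

# Renting bills only while running; owning bills every minute. Without
# energy, owning wins once utilization exceeds the price ratio (~10%):
break_even_util = owned_cost_per_min / rental_cost_per_min

# Adding an assumed ~$0.001/min for power and cooling pushes it to ~15%:
energy_per_min = 0.001
break_even_with_energy = (owned_cost_per_min + energy_per_min) / rental_cost_per_min

print(f"owned: ${owned_cost_per_min:.4f}/min")
print(f"break-even utilization (no energy): {break_even_util:.0%}")
print(f"break-even utilization (with energy): {break_even_with_energy:.0%}")
```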

ToucanLoucan•2h ago
> I don't see the point in self-hosting unless you deploy a GPU in your own datacenter, where you really have control. But that usually costs more for most use cases.

Not wanting to send tons of private data to a company whose foundation is exploiting data it didn't have permission to use?

dabedee•5h ago
This isn't really about cost savings, it's about control. Self-hosting makes sense when you need data privacy, custom fine-tuning, specialized models, or predictable costs at scale. For most use cases requiring GPT-4o-mini quality, you'll pay more for self-hosting until you reach significant volume.
ivape•4h ago
I'm trying to figure out the cost-predictability angle here. It seems like they still have a cost per input/output token, so how is it any different? Also, do I have to assume one GPU instance will scale automatically as traffic goes up?

LLM pricing is pretty intense if you're using anything beyond an 8B model, at least that's what I'm noticing on OpenRouter. 3-4 calls can approach eating up $1 with bigger models, and certainly on frontier ones.

jameswhitford•4h ago
Serverless setups (like Cerebrium) charge per second the model is running; it's not token-based.
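To make the billing-model difference concrete, a per-second GPU rate can be converted into an effective per-token price. The rental rate comes from the thread above; the throughput figure is an assumption, not a measured number:

```python
# Rough conversion of per-second GPU billing into an effective per-token
# price, for comparison with token-priced APIs. The $0.02/min A10 rate is
# from the thread above; tokens/sec throughput is an assumed figure.
gpu_price_per_sec = 0.02 / 60   # serverless A10 rate, per second
tokens_per_sec = 30.0           # assumed generation throughput on an A10

cost_per_token = gpu_price_per_sec / tokens_per_sec
effective_per_million = cost_per_token * 1_000_000

print(f"~${effective_per_million:.2f} per 1M generated tokens")
```

At low throughput the effective per-token price lands well above what the big token-priced APIs charge for small models, which is the gap the thread keeps circling around.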
ivape•4h ago
Ah, you're right, I misread the OpenAI/Cerebrium pricing config variables.
BoorishBears•3h ago
You're still paying more than the GPU typically costs on an hourly basis to take advantage of their per-second billing... and if you don't have enough utilization to saturate an hourly rental then your users are going to be constantly running into cold starts which tend to be brutal for larger models.

Their A100 80GB is going for more than what I pay to rent H100s: if you really want to save money, getting the cheapest hourly rentals possible is the only way you have any hope of saving money vs. major providers.

I think people vastly underestimate how much companies like OpenAI can do with inference efficiency between large nodes, large batch sizes, and hyper optimized inference stacks.

ivape•2h ago
I'll echo one of my original concerns, which is how is this supposed to scale? Am I responsible for that?
BoorishBears•44m ago
How is what supposed to scale?

If you mean the serverless GPU offering, typically you set a cap for how many requests a single instance is meant to serve. Past that cap they'll spin up more instances.

But if you mean rentals, scaling is on you. With LLM inference there's a regime where the model responses will slow down on a per-user basis while overall throughput goes up, but eventually you'll run out of headroom and need more servers.

Another reason why generally speaking it's hard to compete with major providers on cost effectiveness.
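The cap-then-scale-out rule described above amounts to a ceiling division; both numbers here are placeholder assumptions, not any provider's defaults:

```python
# Sketch of the capacity rule: cap concurrent requests per instance and
# scale out past the cap. Both figures below are assumed placeholders.
import math

requests_per_instance_cap = 8   # assumed max concurrent requests one instance serves well
peak_concurrent_requests = 45   # assumed traffic peak

instances_needed = math.ceil(peak_concurrent_requests / requests_per_instance_cap)
print(f"instances needed at peak: {instances_needed}")
```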

ivape•31m ago
> Past that cap they'll spin up more instances.

Thank you, this is what I wanted to know.

> typically you set a cap for how many requests a single instance is meant to serve

If this is on us, then we'd have to make sure whatever caps we set beat API providers. I don't know how easy that cap is to figure out.

benterix•4h ago
To people from Cerebrium: why should I use your services when Runpod is cheaper? I mean, why did you decide to set your prices higher than an established company with a significant user base?
Sanzig•42m ago
(Not affiliated with Cerebrium; I just looked into this a little while back.)

Runpod outsources much of their infrastructure to small players that own GPUs. They have recently added some requirements on security and reliability (e.g., some level of security audit such as SOC 2, has to be hosted in a real DC, has to be in a locked rack), but fundamentally they are leaning on small shops that slap some GPUs in a server at a colocation facility. This personally would make me nervous about any sensitive workloads.

My impression is that Cerebrium either owns their own GPU servers or they're outsourcing to one of the big players. They certainly don't have the "partner program" advertised on their site like Runpod does.

za_mike157•28m ago
Hey! Founder of Cerebrium here.

- Runpod is one of the cheapest, but it comes at the price of reliability (critical for businesses).
- We have better cold-start performance, with something special launching soon here.
- Iterating on your application using CPUs/GPUs in the cloud takes just 2–10 seconds, compared to several minutes with Runpod due to Docker push/pull.
- We allow you to deploy in multiple regions globally, for lower latency and data residency compliance.
- We provide a lot of software abstractions (fire-and-forget jobs, websockets, batching, etc.), whereas Runpod just deploys your Docker image.
- SOC 2 and GDPR compliant.

With that all being said - we are working on optimisations to bring down pricing

Incipient•4h ago
Is this article just saying OpenAI is orders of magnitude cheaper than Cerebrium?
jameswhitford•38m ago
It's a demo project using the free-tier hardware from Cerebrium, demonstrating how to migrate from OpenAI with a few lines of code. The cost is never going to beat OpenAI on an A10; there are more powerful options available.
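For reference, "a few lines of code" is the usual OpenAI-compatible-endpoint pattern (vLLM exposes the same /v1/chat/completions shape): the request stays identical and only the base URL changes. A stdlib-only sketch; the endpoint URL, model name, and env var below are placeholders, not Cerebrium's actual values:

```python
# Minimal OpenAI-compatible chat call with only the stdlib, so the same
# function works against api.openai.com or a self-hosted vLLM endpoint.
# The self-hosted URL, model name, and LLM_API_KEY env var are placeholders.
import json
import os
import urllib.request

def chat(base_url: str, api_key: str, model: str, prompt: str) -> str:
    """POST an OpenAI-style chat completion request and return the reply text."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Migration is just swapping which base URL you pass in:
OPENAI = "https://api.openai.com/v1"
SELF_HOSTED = "https://your-deployment.example.com/v1"  # hypothetical endpoint

if os.environ.get("LLM_API_KEY"):  # only hit the network when configured
    print(chat(SELF_HOSTED, os.environ["LLM_API_KEY"], "llama-3.1-8b", "Hello"))
```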