
Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory

https://github.com/localgpt-app/localgpt
94•yi_wang•3h ago•25 comments

Haskell for all: Beyond agentic coding

https://haskellforall.com/2026/02/beyond-agentic-coding
39•RebelPotato•2h ago•8 comments

SectorC: A C Compiler in 512 bytes (2023)

https://xorvoid.com/sectorc.html
241•valyala•11h ago•46 comments

Speed up responses with fast mode

https://code.claude.com/docs/en/fast-mode
154•surprisetalk•10h ago•150 comments

Software factories and the agentic moment

https://factory.strongdm.ai/
186•mellosouls•13h ago•335 comments

Brookhaven Lab's RHIC concludes 25-year run with final collisions

https://www.hpcwire.com/off-the-wire/brookhaven-labs-rhic-concludes-25-year-run-with-final-collis...
68•gnufx•9h ago•56 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
12•duxup•55m ago•1 comment

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
177•AlexeyBrin•16h ago•32 comments

LLMs as the new high level language

https://federicopereiro.com/llm-high/
56•swah•4d ago•98 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
164•vinhnx•14h ago•16 comments

Total Surface Area Required to Fuel the World with Solar (2009)

https://landartgenerator.org/blagi/archives/127
9•robtherobber•4d ago•2 comments

First Proof

https://arxiv.org/abs/2602.05192
129•samasblack•13h ago•76 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
306•jesperordrup•21h ago•96 comments

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
74•momciloo•11h ago•16 comments

Al Lowe on model trains, funny deaths and working with Disney

https://spillhistorie.no/2026/02/06/interview-with-sierra-veteran-al-lowe/
98•thelok•13h ago•22 comments

FDA intends to take action against non-FDA-approved GLP-1 drugs

https://www.fda.gov/news-events/press-announcements/fda-intends-take-action-against-non-fda-appro...
104•randycupertino•6h ago•225 comments

Vouch

https://twitter.com/mitchellh/status/2020252149117313349
43•chwtutha•1h ago•7 comments

Show HN: A luma dependent chroma compression algorithm (image compression)

https://www.bitsnbites.eu/a-spatial-domain-variable-block-size-luma-dependent-chroma-compression-...
37•mbitsnbites•3d ago•4 comments

Show HN: Axiomeer – An open marketplace for AI agents

https://github.com/ujjwalredd/Axiomeer
12•ujjwalreddyks•5d ago•2 comments

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
572•theblazehen•3d ago•206 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
294•1vuio0pswjnm7•17h ago•471 comments

Microsoft account bugs locked me out of Notepad – Are thin clients ruining PCs?

https://www.windowscentral.com/microsoft/windows-11/windows-locked-me-out-of-notepad-is-the-thin-...
135•josephcsible•9h ago•161 comments

I write games in C (yes, C) (2016)

https://jonathanwhiting.com/writing/blog/games_in_c/
184•valyala•11h ago•166 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
229•limoce•4d ago•125 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
900•klaussilveira•1d ago•276 comments

Selection rather than prediction

https://voratiq.com/blog/selection-rather-than-prediction/
30•languid-photic•4d ago•12 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
146•speckx•4d ago•228 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
145•videotopia•4d ago•48 comments

The F Word

http://muratbuffalo.blogspot.com/2026/02/friction.html
113•zdw•3d ago•56 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
303•isitcontent•1d ago•39 comments

A GPU Calculator That Helps Calculate What GPU to Use

https://calculator.inference.ai/
108•chlobunnee•6mo ago

Comments

chlobunnee•6mo ago
I built a calculator to help researchers and engineers pick the right GPUs for training and inference workloads!

It helps compare GPU options by taking in simple parameters (# of transformer layers, token size, etc.) and letting users know which GPUs are compatible, plus their efficiency for training vs. inference.

The idea came from talking with ML researchers frustrated by slow cluster queues or wasting money on overkill GPUs.

I'd love feedback on what you feel is missing/confusing!

Some things I'm thinking about incorporating next:

- Allowing users to directly compare 2 GPUs and their specs

- Allowing users to see whether a fraction of the GPU can complete their workload

I would really appreciate your thoughts/feedback! Thanks!
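For readers curious what such a compatibility check boils down to, here is a minimal sketch. All names, the overhead factor, and the GPU list are illustrative assumptions, not the site's actual logic or data:

```python
# Hypothetical sketch of a GPU-compatibility check for an ML workload.
# The 1.2x overhead fudge factor and the GPU list are made up for illustration.

def estimate_vram_gb(n_params_b: float, bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough VRAM need in GB: parameter count (in billions) at a given
    precision (FP16 = 2 bytes) plus a fudge factor for activations/workspace."""
    return n_params_b * bytes_per_param * overhead

def compatible_gpus(n_params_b: float, gpus: dict) -> dict:
    """Return the GPUs whose VRAM covers the estimate; `gpus` maps name -> GB."""
    need = estimate_vram_gb(n_params_b)
    return {name: gb for name, gb in gpus.items() if gb >= need}

gpus = {"RTX 4090": 24, "A100 80GB": 80, "H100 80GB": 80}
print(compatible_gpus(7, gpus))   # a 7B model at FP16 needs ~17 GB here
```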

quotemstr•6mo ago
No sharding? At all?
LorenDB•6mo ago
Where's AMD support? I have a 9070 XT and would love to see it listed on here.
funfunfunction•6mo ago
This is a cheap marketing ploy for a GPU reseller with billboards on highway 101 into SF.
ChadNauseam•6mo ago
Hate those ads. "Inference isn't just a buzzword". Who thought it was? (No comment on whether the linked post is a useful tool, I haven't played with it enough to know)
zargon•6mo ago
The best VRAM calculator I have found is https://apxml.com/tools/vram-calculator. It is much more thorough than this one. For example, it understands different models' attention schemes for correct KV cache size calculation, and supports quantization of both the model and the KV cache. Also, fine-tuning. It has its own limitations, such as only supporting specific models.

In practice, though, the generic calculators are not very useful, because model architectures vary (mainly in the KV cache) and the estimates end up being way off. (Not sure whether or not it would be better to discuss it separately, but I submitted it at https://news.ycombinator.com/item?id=44677409)
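The attention-scheme point can be made concrete with the standard KV-cache arithmetic. The architecture numbers below are illustrative, not tied to a specific model:

```python
# KV-cache size arithmetic. With grouped-query attention (GQA) only the
# KV heads count, so a calculator that assumes full multi-head attention
# overestimates the cache by a factor of n_heads / n_kv_heads.

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, batch: int = 1,
                   bytes_per_elem: int = 2) -> int:
    # Factor of 2: one K tensor and one V tensor per layer.
    return (2 * n_layers * n_kv_heads * head_dim
            * context_len * batch * bytes_per_elem)

# Illustrative 32-layer model at 16K context with an FP16 cache:
mha = kv_cache_bytes(32, 32, 128, 16384)  # assuming full MHA: 8 GiB
gqa = kv_cache_bytes(32, 8, 128, 16384)   # with 8 KV heads (GQA): 2 GiB
print(mha / 2**30, gqa / 2**30)
```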
zeroq•6mo ago
This one is indeed much better, and it immediately addresses the feedback I wanted to leave on the original: instead of calculating an artificial scenario, I'd rather state the hardware I actually have at hand and see what I can run on it. Thanks!
jwrallie•6mo ago
Nice! I could have saved so much time downloading models for trial and error with this.
oktoberpaard•6mo ago
It gives weird results for me. I’m using Qwen3-32B with 32K context length at Q4_K_M, with 8 bit KV cache fully offloaded to 24GB VRAM. According to this calculator this should be impossible by a large margin, yet it’s working for me.

Edit: this might be because I’ve got flash attention enabled in Ollama.
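A back-of-envelope check suggests this setup is a tight but real fit, which a calculator assuming FP16 weights and a full-MHA cache would miss. The figures below (64 layers, 8 KV heads via GQA, head_dim 128, and ~4.85 bits/weight average for Q4_K_M) are commonly cited but should be treated as assumptions:

```python
# Weights: Q4_K_M averages roughly 4.85 bits per weight in llama.cpp (assumed).
params = 32.8e9
weights_gb = params * 4.85 / 8 / 1e9      # ~19.9 GB

# KV cache at 8 bits per element, 32K context, assumed Qwen3-32B geometry:
def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx, bytes_per_elem):
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1e9

cache_gb = kv_cache_gb(64, 8, 128, 32 * 1024, 1)   # ~4.3 GB
print(round(weights_gb + cache_gb, 1))   # ~24 GB: tight, but inside a 24 GiB card
```

Activation and workspace overhead still has to fit in the remaining headroom, which is where flash attention plausibly makes the difference.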

yepyip•6mo ago
Somehow you have to login now, to use it. It wasn't like this a few weeks ago...
mdaniel•6mo ago
That is not my experience, maybe your IP is flagged as hammering their site?
yepyip•6mo ago
Oh, I wasn't aware of this. But how can you hammer a calculator? Yes, I have used it maybe 50 times, checking how big a Q4 or smaller model would be with different batch sizes and concurrent users. Do you think it is a heavy calculation?
amanzi•6mo ago
I would have liked to see the RTX 5060 Ti with 16GB mentioned. I can't tell whether it's omitted because it won't work or excluded for some other reason.
amatecha•6mo ago
Yeah, weird miss, but maybe just because it came out more recently. It can be used for ~anything a 5070 could be used for, no? Maybe slower, but still.
snvzz•6mo ago
Rather than a GPU calculator, this is an NVIDIA calculator.
nodesocket•6mo ago
In case you've been living in a cave, Nvidia is the de facto standard for LLM compute.
jakogut•6mo ago
Llama.cpp supports Vulkan, which is supported by all GPU vendors that care about standards and interoperability.

The default should be open and portable APIs, not needlessly furthering a hegemony that is detrimental to us all.

timothyduong•6mo ago
Where's the 3090? Or should that fall into the 4090 (24GB VRAM) category?
jjmarr•6mo ago
AMD support?
mdaniel•6mo ago
> 0 Model Available

Who in the world is expected to populate 11 select/text fields with their favorite model data points they just happen to have lying around, only to see an absolutely meaningless "295% Inference" outcome?

What a dumpster

kouteiheika•6mo ago
The training memory breakdown is wildly inaccurate.

- No one trains big models in FP32 anymore.

- Gradients can also often be in BF16, and they don't actually have to be stored if you're not using gradient accumulation or if you're accumulating them directly in the optimizer's state.

- 32-bit Adam is silly; if you don't have infinite VRAM there's no reason why you wouldn't want to use 8-bit Adam (or you can go even lower with quantized Muon)

- Activations? They take up memory too, but are not mentioned.

It shows that to train a 3.77B parameter model I need 62GB of VRAM. To give some perspective on how much of an overestimate this is: a few weeks back I was training (full fine-tuning, not LoRA) a 14B parameter model on 24GB of VRAM, using every trick in the book to lower VRAM usage. (To be fair, not all of those tricks are available in publicly available training harnesses, but the point still stands: even with an off-the-shelf harness you can do a lot better than what this calculator suggests.)
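The per-parameter arithmetic behind this critique can be sketched as follows. This is a rough model that deliberately ignores activations, which depend on batch size, sequence length, and gradient checkpointing:

```python
# Per-parameter training memory: weights + gradients + optimizer state.

def train_vram_gb(n_params: float, weight_bytes: float,
                  grad_bytes: float, optim_bytes: float) -> float:
    return n_params * (weight_bytes + grad_bytes + optim_bytes) / 1e9

n = 3.77e9
# "Classic" recipe a naive calculator seems to assume:
# FP32 weights (4 B) + FP32 grads (4 B) + 32-bit Adam m and v (8 B)
classic = train_vram_gb(n, 4, 4, 8)   # ~60 GB, close to the 62GB quoted
# Leaner recipe: BF16 weights + BF16 grads + 8-bit Adam (2 states x 1 B)
lean = train_vram_gb(n, 2, 2, 2)      # ~23 GB, before any further tricks
print(classic, lean)
```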

fooker•6mo ago
Fine tuning and training are very different beasts.
kouteiheika•6mo ago
No, they're not? The process is essentially exactly the same, just with a much lower total FLOPs budget, since if you're not training from scratch you don't need to train for as long. I can use *exactly* the same code that I used to fine-tune a model to train a new model from scratch; literally the only differences are whether I initialize the weights randomly or from an existing model, a couple of hyperparameters (e.g. training from scratch wants a higher starting LR), and how long I train.
fooker•6mo ago
No, if you try to train an LLM like you're suggesting:

- You'll get something similar to GPT-2.

- To approach the scale of modern LLMs, you'll need about 10x more than all the GPUs in the world.

It's a neat abstraction to consider these the same, but do you think Meta is paying 100M for writing a 15 line script?

kouteiheika•6mo ago
I still don't understand what exactly you are disagreeing with.

Meta is paying the big bucks because to train a big LLM in a reasonable time you need *scale*. But the process itself is the same as full fine-tuning, just scaled up across many GPUs. If I were patient enough to wait a few years/decades for my single GPU to chug through 15 trillion tokens, then I too could train a Llama from scratch (assuming I fed it the same training data).

fooker•6mo ago
> you need scale.

No, training state-of-the-art LLMs is still a bit of alchemy.

We don't understand what works and what doesn't. Meta is paying 100M each to hire AI researchers not because they know how to scale (they aren't bringing GPUs, lol), but mainly because they remember what worked and what didn't when training GPT-4.

> If I would be patient..

No, you'd spend the time and resources training and end up with something worse than even GPT3.

This is what kept DeepSeek in the headlines for two months straight. Plenty of other companies with 100x more resources are actively trying to build their own LLMs, including big names like Apple and Oracle. They haven't managed to.

ethan_smith•6mo ago
Great points about training optimizations. For inference, similar dramatic memory reductions are possible through quantization (INT4/INT8) which can reduce VRAM needs by 2-8x compared to FP16, allowing much larger models on consumer GPUs.
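The weight-memory side of that claim is simple arithmetic (a sketch; real totals also include the KV cache and runtime overhead, which is where the upper end of the range comes from):

```python
# Weight memory scales linearly with bits per weight.

def weight_gb(n_params_b: float, bits: float) -> float:
    return n_params_b * bits / 8

for bits in (16, 8, 4):
    print(f"70B model at {bits}-bit: {weight_gb(70, bits):.0f} GB")
# FP16 -> INT8 halves weight memory (2x); FP16 -> INT4 quarters it (4x);
# quantizing the KV cache as well pushes total savings toward the 8x end.
```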
daft_pink•6mo ago
It would be really nice if you could import the standard models, so we could see what kind of GPU we'd need for popular models in the news and on Hugging Face.
amelius•6mo ago
I selected LLama 3 70B, and then it said all the GPUs are insufficient for training :(
nottorp•6mo ago
What GPU to use for what? Witcher 4? Death Stranding?
alkonaut•6mo ago
To save you a click: it's about AI.
amstan•6mo ago
You're missing all the AMD stuff. I can run a quantized DeepSeek R1 671B on 4 Framework desktops, yet it's "insufficient" for 10 Nvidia GPUs.