frontpage.

Hardware Attestation as Monopoly Enabler

https://grapheneos.social/@GrapheneOS/116550899908879585
797•ChuckMcM•6h ago•301 comments

Local AI needs to be the norm

https://unix.foo/posts/local-ai-needs-to-be-norm/
487•cylo•7h ago•239 comments

Running local models on an M4 with 24GB memory

https://jola.dev/posts/running-local-models-on-m4
46•shintoist•1h ago•29 comments

Incident Report: CVE-2024-YIKES

https://nesbitt.io/2026/02/03/incident-report-cve-2024-yikes.html
364•miniBill•6h ago•90 comments

Obsidian plugin was abused to deploy a remote access trojan

https://cyber.netsecops.io/articles/obsidian-plugin-abused-in-campaign-to-deploy-phantom-pulse-rat/
47•cmbailey•2h ago•19 comments

First tunnel element of the Fehmarnbelt Tunnel immersed

https://www.arup.com/en-us/news/first-fehmarnbelt-tunnel-element-lowered/
33•robin_reala•3d ago•8 comments

Guy Goma's Accidental BBC Interview Lives on After 20 Years

https://www.nytimes.com/2026/05/06/business/media/bbc-guy-goma-interview.html
34•nxobject•2d ago•8 comments

PS3 Emulator Devs Politely Ask That People Stop Flooding It with AI PRs

https://kotaku.com/playstation-3-emulator-devs-politely-ask-that-people-stop-flooding-it-with-ai-...
38•stalfosknight•1h ago•10 comments

Ask HN: What are you working on? (May 2026)

116•david927•7h ago•418 comments

Traces Of Humanity

https://tracesofhumanity.org/hello-world/
124•alex77456•7h ago•19 comments

Maryland citizens hit with $2B power grid upgrade for out-of-state AI

https://www.tomshardware.com/tech-industry/artificial-intelligence/maryland-citizens-slapped-with...
115•lemonberry•3h ago•41 comments

I returned to AWS and was reminded why I left

http://fourlightyears.blogspot.com/2026/05/i-returned-to-aws-and-was-reminded-hard.html
644•andrewstuart•1d ago•469 comments

The people preserving the scientific practice of bird banding

https://thenarwhal.ca/bird-banding-ontario/
26•bookofjoe•3d ago•0 comments

Eight More 8-bit Era Microprocessors (2024)

https://thechipletter.substack.com/p/eight-more-8-bit-era-microprocessors
45•klelatti•2d ago•12 comments

Stop MitM on the first SSH connection, on any VPS or cloud provider

https://www.joachimschipper.nl/Stop%20MITM%20on%20the%20first%20SSH%20connection,%20on%20any%20VP...
70•JoachimSchipper•2d ago•43 comments

Why modern parents feel more sleep deprived than our ancestors did

https://www.bbc.com/future/article/20260508-parents-in-ancient-times-felt-less-sleep-deprived-wha...
70•1659447091•3h ago•59 comments

The locals don't know

https://www.quarter--mile.com/The-Locals-Dont-Know
90•herbertl•8h ago•63 comments

Lakebase architecture delivers faster Postgres writes

https://www.databricks.com/blog/how-lakebase-architecture-delivers-5x-faster-postgres-writes
89•sp_from_db•2d ago•25 comments

James Schuyler's Genius

https://yalereview.org/article/james-schuylers-genius
5•Thevet•2d ago•0 comments

What's a mathematician to do? (2010)

https://mathoverflow.net/questions/43690/whats-a-mathematician-to-do
148•ipnon•13h ago•72 comments

Idempotency is easy until the second request is different

https://blog.dochia.dev/blog/idempotency/
275•ludovicianul•3d ago•174 comments

Louis Rossmann offers to pay legal fees for a threatened OrcaSlicer developer

https://www.tomshardware.com/3d-printing/louis-rossmann-tells-3d-printer-maker-bambu-lab-to-go-bl...
456•iancmceachern•9h ago•245 comments

Think Linear Algebra (2023)

https://allendowney.github.io/ThinkLinearAlgebra/index.html
158•tamnd•15h ago•17 comments

Show HN: An index of indie web/blog indexes

https://theindex.fyi
92•rocketpastsix•11h ago•29 comments

Task Paralysis and AI

https://g5t.de/articles/20260510-task-paralysis-and-ai/index.html
190•MrGilbert•18h ago•105 comments

Space Cadet Pinball on Linux

https://brennan.io/2026/05/09/pinball-and-escrow/
310•jandeboevrie•13h ago•103 comments

Walking slower? Your ears, not your knees, might be the problem

https://www.wsj.com/health/wellness/hearing-loss-walking-speed-iphone-study-c53c482a
81•marc__1•1d ago•61 comments

9 Mothers (YC P26) Is Hiring

https://jobs.ashbyhq.com/9-mothers?utm_source=x8pZ4B3P3Q
1•ukd1•12h ago

Spain has become one of Europe’s cheapest power markets

https://janrosenow.substack.com/p/spain-just-became-one-of-europes
140•marc__1•8h ago•113 comments

Shunting-Yard Animation

https://somethingorotherwhatever.com/shunting-yard-animation/
56•s1291•9h ago•16 comments

Running local models on an M4 with 24GB memory

https://jola.dev/posts/running-local-models-on-m4
42•shintoist•1h ago

Comments

sbassi•1h ago
A useful data point to know about this setup is how many tokens/sec it generates.
JBorrow•1h ago
It’s stated in TFA
NDlurker•51m ago
You can't expect someone to read 4 paragraphs into an article before commenting
kennywinker•40m ago
@grok is this true?
DrBenCarson•31m ago
Sorry, @grok is offline after declaring himself MechaMussolini earlier today
NBJack•46m ago
I'm puzzled. The M4, as far as I know, doesn't have 24GB. Did the author mean an M40?
spoonyvoid7•43m ago
M4 = M4 Macbook Pro
teaearlgraycold•26m ago
Or Air
sertsa•43m ago
M4 Mac Mini w/24GB sitting right here on my desk.
tra3•42m ago
There’s definitely an option with 24 gigs of ram: https://support.apple.com/en-ca/121552
canpan•40m ago
Recent models (Qwen 3.6 and Gemma) can really do coding locally. Feels like SOTA from maybe a year ago? But you would want about 32-40GB total memory. 24GB is just a bit short of that. A gaming PC with 16GB graphics card and 32GB RAM brings you very close to a usable coding system.
DrBenCarson•32m ago
How are you using that RAM with the GPU?
canpan•29m ago
Llama.cpp with automatic offload to main memory. You can also use Ollama, it is easier, but slower.
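For anyone wondering how a 16GB card plus system RAM splits a model, here is a rough back-of-the-envelope sketch, not a description of what llama.cpp does internally. It assumes the weights dominate memory and that every layer is the same size. The 30B parameter count, 4.5 bits per weight, 48 layers, and 2 GiB of reserved VRAM are hypothetical numbers chosen for illustration.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def layers_on_gpu(model_gib: float, n_layers: int, vram_gib: float,
                  reserved_gib: float = 2.0) -> int:
    """Count how many equally sized layers fit in VRAM.

    reserved_gib is headroom kept free for the KV cache and the
    desktop (an assumed value, tune it for your own machine).
    """
    per_layer = model_gib / n_layers
    usable = max(vram_gib - reserved_gib, 0.0)
    return min(n_layers, int(usable // per_layer))


if __name__ == "__main__":
    # Hypothetical model: 30B parameters at ~4.5 bits per weight, 48 layers.
    size = weights_gib(params_billion=30, bits_per_weight=4.5)
    gpu_layers = layers_on_gpu(size, n_layers=48, vram_gib=16)
    print(f"weights ~{size:.1f} GiB, {gpu_layers} of 48 layers on the GPU")
```

If you set the offload by hand instead of relying on the automatic behaviour, the resulting layer count is the kind of value you would pass to llama.cpp's `--n-gpu-layers` (`-ngl`) option; the remaining layers stay in main memory.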
ai_fry_ur_brain•22m ago
"Coding system" "can really do coding locally"

Vibe coders out here thinking all software development is solved because they made an (ugly and unoriginal) dashboard for their SaaS clone and their single column with 3x3 feature card landing page that's identical to every other vibe coder's "startup"

sourc3•35m ago
I am running a quantized Qwen 3.6 9B model on my M4 Pro 48GB and it is barely useful for basic pi.dev/cc driven development. I think 128GB desktops are the sweet spot to actually get meaningful work done. However, getting your hands on one of these machines is difficult at the moment.

As much fun as it is to run these things locally don’t forget that your time is not free. I am slowly migrating my use cases to openrouter and run the largest qwen model for < $2-3/day with serious use for personal projects.

hparadiz•32m ago
How does it (the openrouter version) compare to ChatGPT 5.5 or Claude Opus 4.6?
sourc3•15m ago
Good enough. It gets 60-70% of the work I need done for a lot less $ (keep in mind I am using these for personal projects that don’t generate revenue). If I were using it with the hopes of making money I think I would just use Codex at this point.
carbocation•28m ago
Was the choice of such a small model driven by a desire for high tok/sec? I ask because an m4 pro 48gb machine can run larger models (if model intelligence is the thing that would make it more useful).
sourc3•18m ago
Yes that was my goal. Also noticed a huge performance gain going from ollama to mlx. Your mileage may vary.
elij•21m ago
I'm using the 30B MoE model on the same spec with 65k tokens as a sub agent with tooling and it absolutely writes decent code. The dense 9B, I agree, wasn't great.
sjones671•12m ago
Thanks for saying this. There's so much nonsense out there online about local models being better than Opus 4.7 and the like. It's just not true for regular users.

I have a brand new M5 MacBook Pro - top end with all the specs and I've tried local models and they're barely functional.

BoredPositron•6m ago
Use the small models for small tasks, like CLI autocomplete, file sorting, small scripts, config files, setting up tooling, grammar, and simple translations. There is so much use in them.
nu11ptr•26m ago
Still trying to understand if a Macbook Pro M5 Max with 128GB is likely going to be able to run coding models well enough that I can cancel my Codex, or even go down to the $20/month plan.
guessmyname•12m ago
A 128GiB MacBook Pro in Canada is what, north of CAD $11k after tax? That’s around USD $7k. At $20/month for a cloud AI subscription, you’re looking at almost 30 years of service for the same money.

How long do people realistically expect a laptop to stay competitive with SOTA local models? Especially in a space where model sizes, context windows, and inference requirements keep moving every year.

And even if the hardware lasts, the local experience usually doesn’t. A heavily quantized local model running at tolerable speeds on consumer hardware is still nowhere near frontier hosted models in reasoning, coding, multimodal capability, tool use, or reliability.

The economics just don’t make sense to me unless you specifically need offline inference, privacy guarantees, or low latency for a niche workflow. Otherwise you’re tying up $10k upfront to run an approximation of what you can already access through a subscription that continuously improves over time.

You could literally put the difference into index funds and probably cover the subscription indefinitely from the returns alone, even accounting for gradual price increases.
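The arithmetic behind this comparison can be sketched in a few lines. The USD $7k hardware price and the $20/month plan come from the comment above; the 5% annual return is purely an assumption for illustration, not a forecast, and the sketch ignores price increases, taxes, and resale value.

```python
def months_covered(hardware_usd: float, monthly_fee_usd: float) -> float:
    """Months of subscription the hardware budget would pay for."""
    return hardware_usd / monthly_fee_usd


def yearly_return(principal_usd: float, annual_rate: float) -> float:
    """Simple (non-compounded) yearly return on the invested budget."""
    return principal_usd * annual_rate


if __name__ == "__main__":
    hardware_usd = 7_000      # figure quoted in the comment
    monthly_fee_usd = 20      # figure quoted in the comment
    assumed_rate = 0.05       # assumption: 5% annual return

    months = months_covered(hardware_usd, monthly_fee_usd)
    print(f"{months:.0f} months, about {months / 12:.1f} years of service")

    returns = yearly_return(hardware_usd, assumed_rate)
    yearly_fee = 12 * monthly_fee_usd
    print(f"yearly return ${returns:.0f} vs yearly fee ${yearly_fee}")
```

Under these assumptions the budget covers 350 months of the plan, and the assumed yearly return exceeds the $240 yearly fee. A higher-tier plan or a lower return changes the picture quickly, so treat the inputs as knobs rather than facts.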

rtpg•24m ago
What kinda harness do people use with these local models? I am quite happy with the Claude Code permission model and interface in general for coding stuff (For chat-y interfaces I have no real opinion)
nl•22m ago
I think it's useful to be realistic about what you can do with a local model, especially something as small as the 9B the author is using. A 9B model is around the level of Sonnet 3.6 - it can do autocomplete and small functions but it loses track trying to understand large problems.

But they are interesting and fun to play with! I do a LOT of work on local agent harnesses etc, mostly for fun.

My current project is a zero install agent: https://gemma-agent-explainer.nicklothian.com/ - Python, SQL and React all run completely in browser. Gemma E4B is recommended for the best experience!

This is under heavy development and needs Chrome for both HTML5 Filesystem API support and LiteRT (although most Chromium based browsers can be made to work with it).

It's different to most agents because it is zero install: the model runs in the browser using LiteRT/LiteLLM (which gives better performance than Transformers.js), and Filesystem API gives it optional sandbox access to a directory to read from.

It is self documenting - you can ask questions like "How is the system prompt used" in the live help pane and it has access to its own source code.

There's quite a lot there: press "Tour" to see it all.

Will be open source next week.

ai_fry_ur_brain•20m ago
Local model evangelists are the equivalent of toddlers playing with the velcro on their shoes and being endlessly entertained.

I don't mean this about you, you seem to realize it's mostly useless, but most of the people on HN act like all software development can be done by a local model and the end of SOTA is around the corner.

nl•6m ago
I think knowledge is power.

I think that the more people who try local models (especially the larger ones) the better.

I sometimes get the impression that many people claiming that local models are as good as frontier models work in "token poor" environments. If you can't build large-scale programs using at least Opus 4.5+ then it's difficult to compare. They compare something like Qwen 27B with Sonnet and see that it is nearly as good, but miss that the frontier models are a lot better.

That knowledge is power, too.

I personally can help make local models more accessible. I can't make Opus cheaper.

soganess•5m ago
Getting so close to good!

I consider Gemma 4 31B (dense / no MoE) the new baseline for local models. It's obviously worse than the frontier models, but it feels less like a science experiment than any previous local model I’ve run, including GPT OSS 120B and Nemotron Super 120B.

On my M5 Max with 128 GB of RAM and the full 256K context window, I see RAM use spike to about 70 GB, with something like 14 GB of system overhead. A 64 GB Panther Lake machine with the full Arc B390, or a 48 GB Snapdragon X2 Elite machine, could probably run it with a 128K to 256K context window.

Even a few years ago, seeing this kinda performance on a mainstream-ish/plus configuration would have seemed like a pipe dream.
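To see why a long context window costs so much RAM on top of the weights, here is a minimal sketch of the standard KV cache size estimate. It assumes every layer uses full attention and stores keys and values at 16 bits; models with sliding-window layers or a quantized cache need less. The layer count, KV head count, and head dimension below are hypothetical and are not taken from any specific model's published configuration.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 2) -> float:
    """Estimate KV cache size in GiB for one sequence.

    The factor of 2 accounts for storing both keys and values.
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_len * bytes_per_elem)
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical architecture: 60 layers, 8 KV heads, head dimension 128.
    for ctx in (128 * 1024, 256 * 1024):
        size = kv_cache_gib(n_layers=60, n_kv_heads=8, head_dim=128,
                            context_len=ctx)
        print(f"context {ctx:>7} tokens -> ~{size:.0f} GiB of KV cache")
```

The cache grows linearly with context length, so halving the window halves this part of the footprint, which is why a smaller machine may still manage a 128K window when 256K does not fit.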