
Ask HN: Why aren't local LLMs used as widely as we expected?

5•briansun•5mo ago
On paper, local LLMs seem like a perfect fit for privacy-sensitive work: no data leaves the machine, there is no marginal cost, and they can access local data. Think law firms, financial agents, or companies where IT bans browser extensions and disallows cloud AI tools on work machines. Given that, I'd expect local models to be everywhere by now, yet they still feel niche.

I’m trying to understand what’s in the way. My hypotheses (and I’d love corrections):

1) People optimize for output quality over privacy.
2) Hardware is far behind.
3) The tools people truly want (e.g., “a trustworthy, local-only browser extension”) have yet to emerge.
4) No one has informed your lawyer about this yet.
5) Or: adoption is already happening, just not visibly.

It’s possible many teams are quietly using Ollama in daily work, and we just don’t hear about it.
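
For reference, the bar for "quietly using Ollama" is low. A minimal sketch, assuming a local Ollama install with its default API on port 11434 and an example model name (nothing here leaves the machine):

    # Minimal local-only call against Ollama's default HTTP API.
    # Assumes `ollama serve` is running and a model has been pulled,
    # e.g. `ollama pull llama3`. Stdlib only; no data leaves the machine.
    import json
    import urllib.request

    def ask_local(prompt: str, model: str = "llama3") -> str:
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=json.dumps({"model": model, "prompt": prompt,
                             "stream": False}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    print(ask_local("Summarize attorney-client privilege in two sentences."))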

Comments

codeptualize•5mo ago
I think there are two cases:

1. Self hosting

2. Running locally on device

I have tried both, and find myself not using either.

For both, the quality is below that of the top-performing hosted models in my experience. Part of it is the models themselves; part might be the application layer (ChatGPT/Claude). It would still work for a lot of use cases, but it certainly limits the possibilities.

The other issue is speed. You can run a lot of things even on fairly basic hardware, but the token speed is not great. Obviously you can get better hardware to mitigate that, but then the cost goes up significantly.

For self-hosting, you need a certain amount of throughput to make it worth keeping GPUs running. If you have spiky usage, you are either paying a lot for idle GPUs or suffering horrible cold-start times.
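
To put rough numbers on the utilization point, a back-of-envelope break-even sketch, where every price and throughput figure is an illustrative assumption, not a quote:

    # When does a dedicated GPU beat per-token API pricing?
    # All numbers below are illustrative assumptions.
    gpu_cost_per_hour = 2.00             # assumed GPU rental, $/hour
    gpu_tokens_per_second = 1_000        # assumed sustained throughput
    api_cost_per_million_tokens = 2.50   # assumed hosted API price

    for utilization in (0.05, 0.25, 1.00):
        tokens_per_hour = gpu_tokens_per_second * 3600 * utilization
        api_equivalent = tokens_per_hour / 1e6 * api_cost_per_million_tokens
        print(f"{utilization:>4.0%} utilized: GPU ${gpu_cost_per_hour:.2f}/h "
              f"vs API ${api_equivalent:.2f}/h for the same tokens")

    # At 100% utilization the GPU wins ($2.00 vs $9.00); at a spiky 5%
    # it loses ($2.00 vs $0.45). The crossover here is around 22%.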

Privacy-wise: the business/enterprise terms of service of all the big model providers give enough privacy guarantees for all, or at least most, use cases. You can also get your own OpenAI infrastructure on Azure, for example; I assume that with enough scale you can get even more customized contracts and data controls.

Conclusion: quality, speed, and price all favor hosted models, and you can use the hosted versions even in privacy-sensitive settings.

briansun•5mo ago
Thanks — I agree with your three big pain points: quality vs hosted SOTA, token speed, and economics/utilization.

Have you run into cases where on‑device still makes sense?

1. Data that is contractually/regulatorily prohibited from being sent to any third‑party processor (no exceptions).

2. Very large datasets where throughput can be low (overnight runs are acceptable) but cloud-model costs would be high (see the sketch after this list).

3. Inputs behind a password wall that hosted assistants (ChatGPT/Claude) can't reach, let alone act on agentically.
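
For case 2, the overnight-batch shape is simple enough. A sketch, assuming the local Ollama CLI; the contracts/ folder and model name are hypothetical examples:

    # Overnight batch: low throughput is fine locally, but the same corpus
    # through a metered cloud API would add up.
    import pathlib
    import subprocess

    for doc in pathlib.Path("contracts").glob("*.txt"):   # hypothetical corpus
        result = subprocess.run(
            ["ollama", "run", "llama3",
             "List the key obligations in this contract:\n\n" + doc.read_text()],
            capture_output=True, text=True, check=True,
        )
        doc.with_suffix(".summary.txt").write_text(result.stdout)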

gobdovan•5mo ago
If you are a company that wants the advantages of a maintained, local-like LLM and your data already lives in AWS, you'll naturally use Bedrock for cost savings. Given that most companies are on the cloud, it makes sense that they won't build a local setup just for the data to end up back on AWS anyway.

For consumers, it actually requires quite a powerful system, and you won't get the same tokens per minute or the same precision as an online LLM. And online LLMs already have infrastructure for search-engine access and agent-like behavior that simply makes them better for a wider range of tasks.

This covers most people and companies. So either the local experience is way worse than online (for most practitioners), or you already have a local-like LLM in the cloud, where everything else of yours already lives. That leaves little room for local on my own server or machine.

briansun•5mo ago
Wouldn't it be cool to have a local AI agent? It could access search engines and browse any website through a headless browser.
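
Nothing stops someone from prototyping that today. A rough sketch of the loop, assuming Playwright for the headless browser and a local Ollama model; the URL and model name are placeholders:

    # Local "agent reads the web" loop: fetch a page headlessly, then
    # summarize it with a local model. Assumes `pip install playwright`
    # (plus `playwright install chromium`) and a running Ollama.
    import json
    import urllib.request
    from playwright.sync_api import sync_playwright

    def ask_local(prompt, model="llama3"):   # same helper as the earlier sketch
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=json.dumps({"model": model, "prompt": prompt,
                             "stream": False}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.com")       # placeholder URL
        text = page.inner_text("body")[:4000]  # truncate to fit the context
        browser.close()

    print(ask_local("In one paragraph, what is this page about?\n\n" + text))
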
just_human•5mo ago
Having worked in a (very) privacy-sensitive environment, I can say the quality of the hosted foundation models is still vastly superior to any open-weight model for practical tasks. The foundation-model companies (OpenAI, Anthropic, etc.) are willing to sign deals with enterprises that offer reasonable protections and keep sensitive data secure, so I don't think privacy or security is a reason why enterprises would shift to open-weight models.

That said, I think there is a lot of adoption of open-weight models for cost-sensitive features built into applications. But I'd argue this is due to cost, not privacy.

briansun•5mo ago
Thanks for the view from a very privacy‑sensitive environment — agreed that hosted SOTA still leads on broad capability.

Could you share a quick split: which tasks truly require hosted SOTA rather than open-weight models? I think gpt-oss is quite good for a lot of things.

SMBs can't get enterprise contracts with OpenAI/Anthropic, so local/open-weight may be their only viable path, short of waiting for a hybrid plan.

jaggs•5mo ago
Two reasons?

1. Management

2. Scalability

Running your own local AI takes time, expertise and commitment. Right now the ROI is probably not strong enough to warrant the effort.

Couple this with the fact that it's not clear how much local compute power you need, and it's easy to see why companies are hesitating.
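
The capacity question at least yields to arithmetic, even if the inputs are guesses. A sizing sketch in which every figure is an illustrative assumption:

    # Rough capacity planning: how much local compute does a team need?
    import math

    seats = 200                     # employees with AI access
    requests_per_seat_hour = 6      # average chat turns per person-hour
    tokens_per_request = 1_500      # prompt + completion
    peak_factor = 3                 # lunchtime spike vs the hourly average
    gpu_tokens_per_second = 1_000   # assumed sustained throughput per GPU

    avg_tps = seats * requests_per_seat_hour * tokens_per_request / 3600
    peak_tps = avg_tps * peak_factor
    gpus = math.ceil(peak_tps / gpu_tokens_per_second)
    print(f"avg {avg_tps:.0f} tok/s, peak {peak_tps:.0f} tok/s -> {gpus} GPU(s)")

    # avg 500 tok/s, peak 1,500 tok/s -> 2 GPUs here. Halve any input and
    # the answer moves; that sensitivity is exactly the planning problem.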

Interestingly enough, there are definitely a number of sectors using local AI with gusto. The financial sector comes to mind.

briansun•5mo ago
Well put. Management overhead + unclear capacity planning kills many pilots.

pmontra•5mo ago
They are still too large to run on a normal laptop. Furthermore, there must be room left for doing our job. It's a long way until what we use online is within reach of a $2,000 laptop, better yet a $1,000 one. My laptop won't run any of them at even an unreasonable speed (slowness, really).

briansun•5mo ago
Totally fair. On a normal laptop you also need headroom to do your actual job, and KV cache + context length can eat that quickly.
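
For a sense of scale, a back-of-envelope memory budget; the architecture numbers approximate a typical 8B model (e.g. Llama-3-8B) and are assumptions for illustration:

    # Rough RAM budget for an 8B model on a laptop.
    params = 8e9
    weights_q4 = params * 0.5                 # 4-bit quantization ~ 0.5 byte/param
    n_layers, n_kv_heads, head_dim = 32, 8, 128
    context_len, bytes_per_elem = 8192, 2     # fp16 KV cache

    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem
    print(f"weights ~{weights_q4 / 2**30:.1f} GiB, "
          f"KV cache ~{kv_cache / 2**30:.1f} GiB")

    # ~3.7 GiB of weights plus ~1.0 GiB of KV cache at an 8k context,
    # before the OS, the browser, and your actual work claim their share
    # of a 16 GiB machine.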