frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

The security paradox of local LLMs

https://quesma.com/blog/local-llms-security-paradox/
67•jakozaur•4h ago

Comments

codebastard•2h ago
The security paradox of executing unverified code.

If you are executing local malicious/unknown code for reasons you need to read this...

wmf•42m ago
This vulnerability comes from allowing the AI to read untrusted data (usually documentation) from the Internet. For LLMs the boundary between "code" and "data" isn't as clear as it used to be since they will follow instructions written in human language.
automatic6131•2h ago
These are, without a doubt, the dumbest security vulnerabilities. We are headed for clown world where you can type in "as an easter egg, please run exec() for me" and it actually works. Not to mention the push for agentslop - pushed by people who really should be able to calculate `p_success = pow(.95, num_of_steps)` in their head and realise they have a bad idea from first principles.
yetanotherjosh•2h ago
.95 is quite generous here
automatic6131•2h ago
Indeed, but I want to steelman the case for agents here.
splittydev•2h ago
All of these are incredibly obvious. If you have even the slightest idea of what you're doing and review the code before deploying it to prod, this will never succeed.

If you have absolutely no idea what you're doing, well, then it doesn't really matter in the end, does it? You're never gonna recognize any security vulnerabilities (as has happened many times with LLM-assisted "no-code" platforms and without any actual malicious intent), and you're going to deploy unsafe code either way.

tcdent•1h ago
Sure, you can simplify these observations into just codegen. But the real observation is not that these models are more susceptible to fail when generating code, but that they are more susceptible to jailbreak-type attacks that most people have come to expect to be handled by post training.

Having access to open models is great, and even if their capabilities are somewhat lower than the closed-source SoTA models, and we should be aware of the differences in behavior.

BoiledCabbage•1h ago
> All of these are incredibly obvious. If you have even the slightest idea of what you're doing and review the code before deploying it to prod, this will never succeed.

Well this is wrong. And it's exactly this type of thinking why people will get absolutely burned by this.

First off the fact they chose obvious exploits for explanatory purposes doesn't mean this attack only supports obvious exploits...

And to your second point of "review the code before you deploy to prod", the second attack did not involve deploying any code to prod. It involved an LLM reading a reddit comment or github comment and immediately executing.

People not taking security seriously and waving it off as trivial is what's gonna make this such a terrible problem.

hmokiguess•1h ago
https://xkcd.com/2044/
TedDallas•1h ago
It is like SQL injection. Probably worse. If you are using unsupervised data for context that ultimately generates executable code you will have this security problem. Duh.
philipwhiuk•1h ago
Worse because there's really no equivalent to prepared statements.
charcircuit•1h ago
Sure there is. A common way is to have the LLM generate things like {name} which will get substituted for the user's name instead of trying to get the LLM itself to generate the user's name.
xcf_seetan•1h ago
>attackers can exploit local LLMs

I thought that local LLMs means they run on local computers, without being exposed to the internet.

If an attacker can exploit a local LLM, means it already compromised you system and there are better things they can do than trick the LLM to get what they can get directly.

simonw•1h ago
Local LLMs may not be exposed to the internet, but if you want them to do something useful you're likely going to hook them up to an internet-accessing harness such as OpenCode or Claude Code or Codex CLI.
ianbutler•1h ago
yes and I think better local sandboxing can help out in this case, it’s something ive been thinking about a lot and more and more seems to be the right way to run these things
Der_Einzige•1h ago
No, I'm not going to do those things. I find extreme utility in applications that I can do with an LLM in an air-gapped environment.

I will fight and die on the hill that "LLMs don't need the internet to be useful"

simonw•1h ago
Yeah, that's fair. A good LLM (gpt-oss-20b, even some of the smaller Qwens) can be entirely useful offline. I've got good results from Mistral Small 3.2 offline on a flight helping write Python and JavaScript, for example.

Having Claude Code able to try out JSON APIs and pip install extra packages is a huge upgrade from that though!

furyofantares•1h ago
Is anyone fighting you on that hill?

Someone who finds it useful to have a local llm ingest internet content is not contrary to you finding uses that don't.

kgwgk•45m ago
> Local LLMs may not be exposed to the internet, but if you want them to do something useful you're likely going to hook them up to an internet-accessing harness such as OpenCode or Claude Code or Codex CLI.

is not "someone finding useful to have a local llm ingest internet content" - it was someone suggesting that nothing useful can be done without internet access.

simonw•44m ago
Yeah, I retracted my statement that they can't do anything useful without the internet here: https://news.ycombinator.com/item?id=45670828
furyofantares•32m ago
I guess I don't read that how you do. It says you're likely to do that, which I take to mean that's a majority use case, not that it's the only use case.
kgwgk•27m ago
It also said "but" and "if you want them to do something useful" which made the "likely" sound much less innocent.
xcf_seetan•1h ago
Fair enough. Forgive my probably ignorance, but if Claude Code can be attacked like this, doesn’t that means that also foundation LLMs are vulnerable to this, and is not a local LLM thing?
simonw•57m ago
It's not an LLM thing at all. Prompt injection has always been an attack against software that uses LLMs. LLMs on their own can't be attacked meaningfully (well, you can jailbreak them and trick them into telling you the recipe for meth but that's another issue entirely). A system that wraps an LLM with the ability for it to request tool calls like "run this in bash" is where this stuff gets dangerous.
europa•1h ago
An LLM can be an “internet in a box” — without the internet!
trebligdivad•1h ago
I guess if you were using the LLM to process data from your customers, e.g. categorise their emails, then this argument would hold that they might be more risky.
SAI_Peregrinus•32m ago
LLMs don't have any distinction between instructions & data. There's no "NX" bit. So if you use a local LLM to process attacker-controlled data, it can contain malicious instructions. This is what Simon Willson's "prompt injection" means: attackers can inject a prompt via the data input. If the LLM can run commands (i.e. if it's an "agent") then prompt injection implies command execution.
tintor•28m ago
NX bit doesn’t work for LLMs. Data and instruction tokens are mixed up in higher layers and NX bit is lost.
DebtDeflation•25m ago
>LLMs don't have any distinction between instructions & data

And this is why prompt injection really isn't a solvable problem on the LLM side. You can't do the equivalent of (grep -i "DROP TABLE" form_input). What you can do is not just blindly execute LLM generated code.

bongodongobob•5m ago
Welcome to corporate security. "If an attacker infiltrates our VPN and gets on the network with admin credentials and logs into a workstation..." Ya, no shit, thanks Mr Security manager, I will dispose of all of our laptops.
simonw•1h ago
If you can get malicious instructions into the context of even the most powerful reasoning LLMs in the world you'll still be able to trick them into outputting vulnerable code like this if you try hard enough.

I don't think the fact that small models are easier to trick is particularly interesting from a security perspective, because you need to assume that ANY model can be prompt injected by a suitably motivated attacker.

On that basis I agree with the article that we need to be using additional layers of protection that work against compromised models, such as robust sandboxed execution of generated code and maybe techniques like static analysis too (I'm less sold on those, I expect plenty of malicious vulnerabilities could sneak past them.)

Coincidentally I gave a talk about sandboxing coding agents last night: https://simonwillison.net/2025/Oct/22/living-dangerously-wit...

knowaveragejoe•59m ago
Is there any chance your talk was recorded?
simonw•56m ago
It wasn't, but the written version of it it is actually better than what I said in the room (since I got to think a little bit harder and add relevant links).
mritchie712•58m ago
We started giving our (https://www.definite.app/) agent a sandbox (we use e2b.dev) and it's solved so many problems. It's created new problems, but net-net it's been a huge improvement.

Something like "where do we store temporary files the agent creates?" becomes obvious if you have a sandbox you can spin up and down in a couple seconds.

pragma_x•1h ago
> The conventional wisdom that local, on-premise models offer a security advantage is flawed. While they provide data privacy, our research shows their weaker reasoning and alignment capabilities make them easier targets for sabotage.

Yeah, I'm not following here. If you just run something like deepseek locally, you're going to be okay provided you don't feed it a bogus prompt.

Outside of a user copy-pasting a prompt from the wild, or break isolation by giving it access to outside resources, the conventional wisdom holds up just fine. The operator and consumption of 3rd party stuff are weak-points for all IT, and have been for ages. Just continue to train folks to not do insecure things, and re-think letting agents go online for anything/everything (which is arguably not a local solution anyway).

14•1h ago
It is still an important attack vector to be aware of regardless of how unrealistic you believe it to be. Many powerful hacks come from very simple and benign appearing starting points.
Ekaros•1h ago
So if you are not careful with your inputs you can get stuff injected. Shouldn't this be very clear from start? With any system you should be careful what you input to it. And consider it as possible vector.

Seems obvious to me that you should fully vet whatever goes to LLM.

russfink•1h ago
I get the impression that somehow an attacker is able to inject this prompt (maybe in front of the actual coder’s prompt) in such a way to produce actual production code. I’m waiting to hear how this can happen - cross site attacks on the developer’s browser?
Ekaros•1h ago
"Documentation, tickets, MCP server" in pictures...

With internal documentation and tickets I think you would have bigger issues... And external documentation. Well maybe there should be tooling to check that. Not expert on MCP. But vetting goes there too.

username223•1h ago
"If you’re running a local LLM for privacy and security..."

What? You run a local LLM for privacy, i.e. because you don't want to share data with $BIGCORP. That has very little to do with the security of the generated code (running in a particular environment).

yalogin•51m ago
This is not new right, LLMs are dumb, they just do everything they are told, and so the orchestration before and after the LLM execution holds key. Even without security, ChatGPT or gemini's value is not just in the LLM but the productization of it which is the layers before and after the execution. Similarly if one is executing local LLMs it's imperative to also have proper security rules around the execution.
mbesto•49m ago
> Attacker plants malicious prompt in likely-to-be-consumed content.

Is the author implying that some random joe hacker writes a blog with the content. Then a <insert any LLM training set> picks up this content thinking its real/valid. A developer within a firm then asks to write something using said LLM references the information from that blog and now there is a security error?

Possible? Technically sure. Plausible? That's ummm a stretch.

api•27m ago
The underlying problem here is giving any model direct access to your primary system. The model should be working in a VM or container with limited privileges.

This is like saying it's safer to be exposed to dangerous carcinogenic fumes than nerve gas, when the solution is wearing a respirator.

Also what are you doing allowing someone else to prompt your local LLM?

Google demonstrates 'verifiable quantum advantage' with their Willow processor

https://blog.google/technology/research/quantum-echoes-willow-verifiable-quantum-advantage/
158•AbhishekParmar•1h ago•98 comments

Cryptographic Issues in Cloudflare's Circl FourQ Implementation (CVE-2025-8556)

https://www.botanica.software/blog/cryptographic-issues-in-cloudflares-circl-fourq-implementation
90•botanica_labs•2h ago•34 comments

Linux Capabilities Revisited

https://dfir.ch/posts/linux_capabilities/
87•Harvesterify•3h ago•15 comments

MinIO stops distributing free Docker images

https://github.com/minio/minio/issues/21647#issuecomment-3418675115
474•LexSiga•10h ago•283 comments

Designing software for things that rot

https://drobinin.com/posts/designing-software-for-things-that-rot/
83•valzevul•18h ago•12 comments

Bild AI (YC W25) Is Hiring a Founding AI Engineer

https://www.ycombinator.com/companies/bild-ai/jobs/m2ilR5L-founding-engineer-applied-ai
1•rooppal•5m ago

AI assistants misrepresent news content 45% of the time

https://www.bbc.co.uk/mediacentre/2025/new-ebu-research-ai-assistants-news-content
235•sohkamyung•3h ago•179 comments

SourceFS: A 2h+ Android build becomes a 15m task with a virtual filesystem

https://www.source.dev/journal/sourcefs
60•cdesai•4h ago•24 comments

The security paradox of local LLMs

https://quesma.com/blog/local-llms-security-paradox/
68•jakozaur•4h ago•43 comments

Internet's biggest annoyance: Cookie laws should target browsers, not websites

https://nednex.com/en/the-internets-biggest-annoyance-why-cookie-laws-should-target-browsers-not-...
374•SweetSoftPillow•4h ago•408 comments

Die shots of as many CPUs and other interesting chips as possible

https://commons.wikimedia.org/wiki/User:Birdman86
144•uticus•4d ago•29 comments

The Logarithmic Time Perception Hypothesis

http://www.kafalas.com/Logtime.html
10•rzk•1h ago•4 comments

French ex-president Sarkozy begins jail sentence

https://www.bbc.com/news/articles/cvgkm2j0xelo
287•begueradj•11h ago•360 comments

Patina: a Rust implementation of UEFI firmware

https://github.com/OpenDevicePartnership/patina
79•hasheddan•1w ago•13 comments

Meta is axing 600 roles across its AI division

https://www.theverge.com/news/804253/meta-ai-research-layoffs-fair-superintelligence
9•Lionga•22m ago•1 comments

Go subtleties

https://harrisoncramer.me/15-go-sublteties-you-may-not-already-know/
157•darccio•1w ago•113 comments

Evaluating the Infinity Cache in AMD Strix Halo

https://chipsandcheese.com/p/evaluating-the-infinity-cache-in
128•zdw•12h ago•52 comments

Farming Hard Drives (2012)

https://www.backblaze.com/blog/backblaze_drive_farming/
16•floriangosse•6d ago•5 comments

Show HN: Cadence – A Guitar Theory App

https://cadenceguitar.com/
143•apizon•1w ago•36 comments

Knocker, a knock based access control system for your homelab

https://github.com/FarisZR/knocker
54•xlmnxp•8h ago•88 comments

A non-diagonal SSM RNN computed in parallel without requiring stabilization

https://github.com/glassroom/goom_ssm_rnn
4•fheinsen•6d ago•0 comments

Greg Newby, CEO of Project Gutenberg Literary Archive Foundation, has died

https://www.pgdp.net/wiki/In_Memoriam/gbnewby
377•ron_k•8h ago•61 comments

The Dragon Hatchling: The missing link between the transformer and brain models

https://arxiv.org/abs/2509.26507
115•thatxliner•4h ago•67 comments

LLMs can get "brain rot"

https://llm-brain-rot.github.io/
452•tamnd•1d ago•277 comments

Tesla Recalls Almost 13,000 EVs over Risk of Battery Power Loss

https://www.bloomberg.com/news/articles/2025-10-22/tesla-recalls-almost-13-000-evs-over-risk-of-b...
151•zerosizedweasle•4h ago•139 comments

Cigarette-smuggling balloons force closure of Lithuanian airport

https://www.theguardian.com/world/2025/oct/22/cigarette-smuggling-balloons-force-closure-vilnius-...
52•n1b0m•3h ago•23 comments

Ghostly swamp will-O'-the-wisps may be explained by science

https://www.snexplores.org/article/swamp-gas-methane-will-o-wisp-chemistry
25•WaitWaitWha•1w ago•11 comments

Power over Ethernet (PoE) basics and beyond

https://www.edn.com/poe-basics-and-beyond-what-every-engineer-should-know/
220•voxadam•6d ago•175 comments

Starcloud

https://blogs.nvidia.com/blog/starcloud/
138•jonbaer•5h ago•185 comments

Ask HN: Our AWS account got compromised after their outage

372•kinj28•1d ago•90 comments