Potential session/cache leakage between workspace instances or consumer accounts

https://github.com/anthropics/claude-code/issues/74066
119•chatmasta•1h ago

Comments

Tiberium•1h ago
Sounds like a hallucination unless proven otherwise. Even the leading LLMs produce those from time to time, and they always appear plausible like that. It could also be that the session had a lot of previous context, like 800K+, which (I think) makes hallucinations more likely.

Relevant comment from the OP which makes a hallucination more likely:

> There is one tool call result that includes a string that printed a pathname including minecraft.py because it was listing the files in a Python virtual environment and the Pygments package has a lexer called minecraft.py

xyzzy_plugh•1h ago
I don't disagree but this sort of thing has to be investigated regardless.

It's unfortunate that there is so little transparency that even if they deny there was a leak we will never know for certain.

macNchz•1h ago
The person posting this claims to have reproduced in a separate context down the thread:

> Same thing just happened on a Claude Mobile session in same Enterprise account. Common theme in both is Sonnet 5, first response after more than 5 minutes (cache miss).

alserio•35m ago
Why? What makes it more likely?
andy99•18m ago
I realize hallucination has no precise definition but this doesn’t sound at all like anything I’ve ever heard called hallucination. Hallucination is usually plausible wrong answers or made up info that ends up fitting the most likely response (like a manufactured citation) and comes from the way LLMs work at predicting tokens. This example demonstrates completely implausible output, it’s not something that fits with hallucination.

All that said, it doesn’t require cross session leakage, it could just be training data or like those nightingale (probably the wrong bird*) data generations where they just prompt an LLM with nothing and it starts spitting out conversations.

I see a bunch of downstream comments about caching, sounds like maybe there’s an error where it loads nothing instead of the cache and so starts spitting out random generations.

* edit: it’s magpie. Worth looking at the concept. I’m not sure people realize that LLMs generate random conversations when prompted with nothing. This seems at least as likely as sessions leaking: https://github.com/magpie-align/magpie

acepl•1h ago
Oh yes, we do not need programmers any more…
emehex•1h ago
"Coding is largely solved"
techpression•1h ago
I love that quote, especially considering the insane amount of bugs that are produced. It’s as easy to debunk as someone claiming "I can jump to the moon".
consp•1h ago
While it is abused by LLM vendors, I've been hearing that phrase in one form or another since the early '00s, and it's likely way older.
ethagnawl•52m ago
Sure but have you ever seen it actually play out in practice like it currently is? Whether or not it's true (of course it's not) people are currently behaving as if it is and firing/hiring accordingly.
philipov•23m ago
Well, when was the last time you wrote machine code by hand?

... but then they went and changed what coding meant.

We've always been layering abstractions on top of abstractions. If we get to an abstraction that works well enough that you no longer have to dive down into the previous layer, we say we've solved coding, and change what coding means. Obviously LLMs aren't there yet.

Avicebron•1h ago
In order, Fable 5 has rejected:

"Recipe for red-braised pork, I have pork shoulder"

"Write up a framework for MCP patterns I can give to claude code"

"explain the biomechanics of motion in c. elegans" (I get this one, I mostly did it to test and it's related to my hobby project)

Do we get an extra day of functional Fable 5 because it's down?

HumanOstrich•18m ago
What does this have to do with anything? Who are you talking to? This is Hacker News, not Anthropic support.
asveikau•4m ago
HN becoming anthropic support would certainly explain a lot of threads and comments I've seen here lately. Thank you for this.
andy99•15m ago
Not sure of the relevance of this comment, but normally if someone built a classifier that bad they’d be fired. Anthropic obviously thinks they have some monopoly power they can use to foist garbage on consumers. I think they don’t.
nijave•8m ago
The safety filter rejected or the model was down?
ec109685•1h ago
Caching doesn’t work the way the bug reporter implies. Caches are shared (at least across the enterprise), but the cache key is always a function of the input before it.

We achieved significant savings simply by moving everything that varies across individuals out of the system prompt so every session starts from a cache point.

For example you never want your system prompt to start with the time that the session started. Move that to the first user message if needed.
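A minimal sketch of that idea, assuming a cache keyed by a running hash of everything up to each segment boundary. The function names and the segment layout here are hypothetical, for illustration only; this is not any vendor's actual scheme:

```python
import hashlib


def prefix_keys(segments):
    """Return one key per segment; each key covers that segment and all before it."""
    keys = []
    running = hashlib.sha256()
    for seg in segments:
        data = seg.encode("utf-8")
        # Length prefix keeps segment boundaries unambiguous.
        running.update(len(data).to_bytes(8, "big"))
        running.update(data)
        keys.append(running.copy().hexdigest())
    return keys


def shared_prefix_len(keys_a, keys_b):
    """Count how many leading segments two sessions could reuse from each other."""
    n = 0
    for x, y in zip(keys_a, keys_b):
        if x != y:
            break
        n += 1
    return n


SYSTEM = "You are a coding assistant."
TOOLS = "tool definitions ..."

# Variant A: the session start time sits at the front of the system prompt.
a1 = prefix_keys(["Session started 09:00. " + SYSTEM, TOOLS, "user: hi"])
a2 = prefix_keys(["Session started 09:07. " + SYSTEM, TOOLS, "user: hi"])

# Variant B: the time is moved into the first user message.
b1 = prefix_keys([SYSTEM, TOOLS, "user: (09:00) hi"])
b2 = prefix_keys([SYSTEM, TOOLS, "user: (09:07) hi"])

print(shared_prefix_len(a1, a2))  # 0
print(shared_prefix_len(b1, b2))  # 2
```

In variant A nothing is reusable because the very first segment differs. In variant B the system prompt and tool definitions hash identically for both sessions, so only the user turn has to be recomputed.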

macNchz•1h ago
Caching is not supposed to work like that, but that doesn’t preclude the cache key computation function from having bugs.
marginalia_nu•58m ago
Yeah there's quite a lot of potential bugs that could have this shape. If I were to guess it could be a buffer in a buffer pool not being sized and zeroed correctly, allowing stale data to bleed between sessions.
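To illustrate the shape of bug I mean, here is a toy sketch with a hypothetical `BufferPool` and `handle` function. It is pure speculation about the bug class, not a claim about what actually happened:

```python
class BufferPool:
    """Toy pool of fixed-size buffers that are reused across sessions."""

    def __init__(self, size, zero_on_release):
        self.size = size
        self.zero_on_release = zero_on_release
        self.free = []

    def acquire(self):
        # Reuse a released buffer if one exists, else allocate a zeroed one.
        return self.free.pop() if self.free else bytearray(self.size)

    def release(self, buf):
        if self.zero_on_release:
            buf[:] = bytes(len(buf))  # overwrite with zeros, length unchanged
        self.free.append(buf)


def handle(pool, payload):
    """Write payload into a pooled buffer, then read the buffer back.

    The read deliberately scans the whole buffer (stripping trailing zeros)
    instead of using len(payload), which is what exposes stale bytes.
    """
    buf = pool.acquire()
    buf[:len(payload)] = payload
    out = bytes(buf).rstrip(b"\x00")
    pool.release(buf)
    return out


leaky = BufferPool(32, zero_on_release=False)
handle(leaky, b"minecraft server notes")  # session 1
print(handle(leaky, b"hello"))            # session 2: b'helloraft server notes'

safe = BufferPool(32, zero_on_release=True)
handle(safe, b"minecraft server notes")
print(handle(safe, b"hello"))             # b'hello'
```

The second session in the leaky pool sees the tail of the first session's data because the buffer was neither zeroed nor read with the correct length.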
supriyo-biswas•1h ago
There could just also be a bug where the output tokens of session 1 were shared with session 2, due to a race condition or similar.
Waterluvian•35m ago
There is a massive incentive for optimization, so I expect they’re doing a ton of very clever tricks, all of which make this kind of bug more likely.
estebarb
jstummbillig•1h ago
Is there anything particular about LLMs that would make separating customer data harder than in other SaaS cases?
27183•51m ago
If I had to hazard a guess, doing anything in a multi-tenant way on a GPU is going to be hard mode compared to most SaaS due to lack of memory safe tooling. I've built multi-tenant SaaS systems, and I've done a little GPU programming (a long time ago), but I've never tried to combine the two disciplines.
woadwarrior01•47m ago
It'd be terribly compute inefficient to not share prefix caches (KV cache) across customers.
acepl•38m ago
What is the probability that two customers will have exactly the same tokens in cache? Wouldn't it require using the exact same CLAUDE.md, skills, MCPs and context? After that it gets even worse, given the nondeterminism of LLMs and humans.
27183•32m ago
I suspect what GP is getting at is that there will be a strong incentive to implement some structural sharing across tenants to avoid redundantly storing the same tokens over and over. At least I'd be tempted to do this if I was working with a very precious, constrained resource (e.g. VRAM). Doing this correctly seems... very difficult. [edit] To answer your question directly: the probability that the entire cache is identical between two different users is very low, but the probability that there exist identical chunks of cache between two different users is very high. Exploiting those commonalities successfully will significantly compress the data.
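A rough sketch of what I mean by structural sharing, assuming fixed-size token blocks keyed by a chained hash and a refcounted store. `BLOCK`, `block_keys` and `insert` are made-up names for illustration, and real systems will differ in the details:

```python
import hashlib

BLOCK = 4  # tokens per block; tiny on purpose for the example


def block_keys(tokens):
    """Key each full block by a hash of its tokens chained with the previous key."""
    keys, parent = [], b""
    full = len(tokens) - len(tokens) % BLOCK  # ignore a trailing partial block
    for i in range(0, full, BLOCK):
        chunk = b",".join(str(t).encode() for t in tokens[i:i + BLOCK])
        parent = hashlib.sha256(parent + b"|" + chunk).digest()
        keys.append(parent)
    return keys


def insert(store, tokens):
    """Store each block once and bump a reference count on reuse."""
    for key in block_keys(tokens):
        store[key] = store.get(key, 0) + 1


shared = list(range(8))                    # e.g. a common system prompt: 2 blocks
user_a = shared + [100, 101, 102, 103]     # tenant A's own tail: 1 block
user_b = shared + [200, 201, 202, 203]     # tenant B's own tail: 1 block

store = {}
insert(store, user_a)
insert(store, user_b)

print(len(store))           # 4 unique blocks stored
print(sum(store.values()))  # 6 block references in total
```

The two sequences are never identical as a whole, yet two of the three blocks in each are shared, so the store holds 4 blocks instead of 6. Because each key chains in the previous one, a block is only shared when everything before it matches too.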
bix6•42m ago
So the options are this amazing tech is so stupid it just randomly brings up Minecraft or it’s got a major security issue?
27183•40m ago
¿Por qué no los dos?
Kapura•36m ago
happy fourth of july everybody!
ofjcihen•8m ago
Happy fourth to you too :)
ryantsuji•24m ago
Note the repro condition: first response after 5+ min, i.e. a cache miss. A cache leak would show up on hits (someone else's cached prefix), not on misses where everything is recomputed from your own tokens.
dofm•14m ago
Just add a line in AGENTS.md that says "never talk about Minecraft unless you're explicitly asked", I'm sure it'll be fine after that.
ai_fry_ur_brain•7m ago
OpenRouter's model providers quite frequently give me URLs that other people have given them.
TZubiri•7m ago
0 evidence. If this were a real privacy leak, the author would ask their coworker whether they had talked about the unexpected topic, instead of speculating:

>"Maybe my coworker was talking about this in another session?"

This would be a critical bug that would significantly slash the market value of a T$ company. Go ask your coworker or close the ticket. Why do you expect the devs to put an enormous amount of effort into hunting a potentially nonexistent bug if you can't make that minuscule debugging effort?

bfeynman•4m ago
fwiw, this could be a bug, but the submitter's level of arrogance places this rather high on the Dunning-Kruger side of things. There are multiple other plausible explanations, but this person is probably a vibe coder who believes anything an LLM says (including when it explains its own hallucinations).
dchest•2m ago
Could it be malware? Something like https://news.ycombinator.com/item?id=48667495
supriyo-biswas•22m ago
The funny thing is at my current employer, they mentioned that "coding is increasingly becoming a solved problem" and in the same breath, mentioned that one project was too hard for anyone to do so they're not doing it and would rather sell existing features...
kylehotchkiss•1h ago
50% unemployment :D
JohnMakin•12m ago
it’s the wet dream of execs and pm types. however, i have not seen anything close to it in my life. I remember the UML days, lol. the issue is not the code, it’s the translation layer between business and code. maybe someday ai bridges that gap. history has shown probably not
•24m ago
Hash functions necessarily have collisions. Also, it is perfectly possible to introduce bugs in the hashing (the hash inputs, or the hash function itself) that allow cross-account contamination.
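As a hedged illustration of the "hash inputs" case, here is a toy cache where the buggy key only covers the first few tokens of the prefix it stands for. All names are hypothetical, and the cached value is a stand-in for whatever state a real system would store:

```python
import hashlib


def buggy_key(tokens):
    # Bug: only the first 4 tokens are hashed, but the entry covers all of them.
    return hashlib.sha256(repr(tokens[:4]).encode()).hexdigest()


def full_key(tokens):
    # The key covers every token the cached entry represents.
    return hashlib.sha256(repr(tokens).encode()).hexdigest()


def get_or_compute(cache, key_fn, tokens):
    """Return the cached state for tokens, computing it on a miss."""
    key = key_fn(tokens)
    if key not in cache:
        cache[key] = list(tokens)  # stand-in for the real cached state
    return cache[key]


session_a = ["sys", "tools", "user:", "plan", "my", "minecraft", "server"]
session_b = ["sys", "tools", "user:", "plan", "the", "db", "migration"]

bad = {}
get_or_compute(bad, buggy_key, session_a)
print(get_or_compute(bad, buggy_key, session_b) == session_a)  # True: B gets A's state

good = {}
get_or_compute(good, full_key, session_a)
print(get_or_compute(good, full_key, session_b) == session_b)  # True
```

No hash collision in the cryptographic sense is needed here. The two sessions simply agree on the part of the input that the buggy key looks at.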
dezgeg•4m ago
System prompt for something like Claude Code should be identical, no?
adam_arthur•33m ago
Vibe-coding the implementation.

I haven't had much issue with Codex, but seems Claude Code has major issues being reported nearly on the daily.

They also happen to be the most boastful about not reading or looking at the code.

LLMs are very capable, but not nearly to the level they seem to be messaging.

(We've actually moved on from vibe-coding to having the LLM vibe code itself in a loop)

27183•24m ago
> having the LLM vibe code itself in a loop

The businesslatin name for this is Recursive Self-Improvement

rabbidruster•21m ago
Interestingly I had an almost identical experience to this report in codex. It output a user memory file that looked awfully real and wasn't at all related to my work.
