OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
592•klaussilveira•11h ago•176 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
901•xnx•17h ago•545 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
22•helloplanets•4d ago•15 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
95•matheusalmeida•1d ago•22 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
28•videotopia•4d ago•0 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
203•isitcontent•11h ago•24 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
199•dmpetrov•12h ago•91 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
313•vecti•13h ago•137 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
353•aktau•18h ago•176 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
355•ostacke•17h ago•92 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
459•todsacerdoti•19h ago•231 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
23•romes•4d ago•3 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
80•quibono•4d ago•18 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
259•eljojo•14h ago•155 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
7•bikenaga•3d ago•1 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
392•lstoll•18h ago•266 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
53•kmm•4d ago•3 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
234•i5heu•14h ago•178 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
46•gfortaine•9h ago•13 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
122•SerCe•7h ago•103 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
136•vmatsiiako•16h ago•60 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
68•phreda4•11h ago•12 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
271•surprisetalk•3d ago•37 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
25•gmays•6h ago•7 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1044•cdrnsf•21h ago•431 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faster and better

https://github.com/dmtrKovalenko/zlob
13•neogoose•4h ago•9 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
171•limoce•3d ago•91 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
60•rescrv•19h ago•22 comments

WebView performance significantly slower than PWA

https://issues.chromium.org/issues/40817676
27•denysonique•8h ago•5 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
89•antves•1d ago•66 comments

GPUHammer: Rowhammer attacks on GPU memories are practical

https://gpuhammer.com/
271•jonbaer•6mo ago

Comments

perching_aix•6mo ago
HW noob here; does anyone have insight into how an issue like this passes EM simulation during development? I understand that modern chips are way too complex for full formal verification, but I'd have thought memory modules would be so structurally regular that it might be possible there despite that.
andyferris•6mo ago
I am no expert in the field, but my reading of the original rowhammer issue (and later partial hardware mitigations) was that it was seen as better to design RAM that works fast and is dense and get that to market, than to engineer something provably untamperable with greater tolerances / die size / latency.

GPUs have always been squarely in the "get stuff to consumers ASAP" camp, rather than NASA-like engineering that can withstand cosmic rays and such.

I also presume an EM simulation would be able to spot it, but prior to rowhammer it is also possible no-one ever thought to check for it (or more likely that they'd check the simulation with random or typical data inputs, not a hitherto-unthought-of attack vector, but that doesn't explain more modern hardware).

privatelypublic•6mo ago
I seem to recall that rowhammer was known- but thought impossible for userland code to implement.

This is a huge theme for vulnerabilities. I almost said "modern" but looking back I've seen the cycle (disregard attacks as strictly hypothetical. Get caught unprepared when somebody publishes something making it practical) happen more than a few times.

Palomides•6mo ago
someone did a javascript rowhammer in 2015, hardware that's vulnerable today is just manufacturers and customers deciding they don't want to pay for mitigation

(personally I think all RAM in all devices should be ECC)

grafmax•6mo ago
Manufacturers aren’t held liable for negligence like this. It’s a classic case of economic externality.
andyferris•6mo ago
Yes it is - how would you go about fixing that?
yndoendo•6mo ago
The only means might be cultural. Security conferences such as DefCon or Black Hat could create a list of insecure technology that is ubiquitous and ignored by product designers and OEMs, vote on ranking its priority, and decide when it should be removed.

News would latch on to "Hackers say all computers without ECC RAM are vulnerable and should not be purchased because of their insecurity. Manufacturers like Dell, Asus, Acer, ... are selling products that help hackers steal your information." "DefCon Hackers thank Nvidia for making their jobs easier ..."

Such statements would be refreshed during and after each security conference. There are over 12 conferences a year, so about once a month these would be brought back before the public as a reminder. The public might stop purchasing from those manufacturers, or choose the secure products, and create the change.

andyferris•6mo ago
> manufacturers and customers deciding they don't want to pay

It's more of a tragedy-of-the-commons problem. Consumers don't know what they don't know and manufacturers need to be competitive with respect to each other. Without some kind of oversight (industry standards bodies or government regulation), or a level of shaming that breaks through to consumers (or e.g. class action lawsuits that impact manufacturers), no individual has any incentive to change.

progmetaldev•6mo ago
Shame is an underrated way of pushing for better standards. The problem is getting people in the know and having them vote with their wallets, or at least with public sentiment (social media pressure).
userbinator•6mo ago
The manufacturers tried to sweep it under the rug when the first RowHammer came out. One of the memory testing utilities added tests for it, and then disabled those because they would cause too many failures.
userbinator•6mo ago
We don't want "mitigation", we want true correctness --- or at least the level of perfection achievable before manufacturers thought they could operate with negative data integrity margins and convinced others that it was fine (one popular memory testing utility made RH tests optional and hidden by default, under the reasoning that "too many DIMMs would fail"!) All DRAM generations before DDR2 and early DDR3 didn't have this problem.

RAM that doesn't behave like RAM is not RAM. It's defective. ECC is merely an attempt at fixing something that shouldn't've made it to the market in the first place. AFAIK there is a RH variant that manages to flip bits undetectably even with ECC RAM.

nsteel•6mo ago
> AFAIK there is a RH variant that manages to flip bits undetectably even with ECC RAM.

Single Error Correction, Double Error Detection, Triple Error Chaos.
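
To make the "chaos" part concrete, here's a toy sketch using an extended Hamming(8,4) code (real DIMMs use a (72,64) code, but the failure mode is the same idea): one flip gets corrected, two flips get detected, and a well-placed third flip gets silently "corrected" into the wrong data.

    /* Toy SECDED: extended Hamming(8,4). Bits 1,2,4 are Hamming parity,
     * bit 0 is overall parity, data lives in bits 3,5,6,7. */
    #include <stdio.h>
    #include <stdint.h>

    static uint8_t encode(uint8_t d)           /* d = 4 data bits */
    {
        uint8_t c[8] = {0};
        c[3] = d & 1; c[5] = (d >> 1) & 1; c[6] = (d >> 2) & 1; c[7] = (d >> 3) & 1;
        c[1] = c[3] ^ c[5] ^ c[7];             /* Hamming parity bits */
        c[2] = c[3] ^ c[6] ^ c[7];
        c[4] = c[5] ^ c[6] ^ c[7];
        c[0] = c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]; /* overall parity */
        uint8_t w = 0;
        for (int i = 0; i < 8; i++) w |= (uint8_t)(c[i] << i);
        return w;
    }

    static void decode(uint8_t w)
    {
        uint8_t c[8];
        for (int i = 0; i < 8; i++) c[i] = (w >> i) & 1;
        int syndrome = (c[1] ^ c[3] ^ c[5] ^ c[7])
                     | (c[2] ^ c[3] ^ c[6] ^ c[7]) << 1
                     | (c[4] ^ c[5] ^ c[6] ^ c[7]) << 2;
        int parity_ok = (c[0] ^ c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]) == 0;

        if (syndrome == 0 && parity_ok) printf("no error\n");
        else if (!parity_ok)            printf("single error at bit %d, corrected\n", syndrome);
        else                            printf("double error detected, uncorrectable\n");
    }

    int main(void)
    {
        uint8_t w = encode(0xB);   /* data bits 1011 */
        decode(w);                 /* -> no error */
        decode(w ^ 0x20);          /* 1 flip  -> corrected */
        decode(w ^ 0x28);          /* 2 flips -> detected */
        decode(w ^ 0x16);          /* 3 flips (all three parity bits) -> reported as
                                      "single error at bit 7", so an untouched data
                                      bit gets "corrected" into garbage: chaos */
        return 0;
    }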

ryao•6mo ago
The manufacturers chose this. Most customers were not offered a choice.

It should be considered unethical to sell machines with non-ECC memory in any real volume.

justincormack•6mo ago
You don't have to buy them.
privatelypublic•6mo ago
I'm coming back to note: die shrinks, density increases, and frequency increases - while keeping costs from going out of control - all work together to make rowhammer inevitable. I maintain they knew about it, dismissed it as impractical, tested whether it was a concern in normal usage... and were surprised and caught with their pants down when a PoC hit the public.

I'm not versed enough in silicon fabrication to know whether there are ameliorations beyond what hit the press nearly 20 years ago now. But while deep-diving modern DRAM for an idea, it's shocking how small a change is needed to corrupt a bit in DRAM.

userbinator•6mo ago
> but prior to rowhammer it is also possible no-one ever thought to check for it

It was known as "pattern sensitivity" in the industry for decades, basically ever since the beginning, and considered a blocking defect. Here's a random article from 1989 (don't know why first page is missing, but look at the references): http://web.eecs.umich.edu/~mazum/PAPERS-MAZUM/patternsensiti...

Then some bastards like these came along...

https://research.ece.cmu.edu/safari/thesis/skhan_jobtalk_sli...

...and essentially said "who cares, let someone else be responsible for the imperfections while we can sell more crap", leading to the current mess we're in.

The flash memory industry took a similar dark turn decades ago.

MadnessASAP•6mo ago
Given that I wasn't surprised by the headline, I have to imagine that Nvidia engineers were also well aware.

Nothing is perfect, everything has its failure conditions. The question is where do you choose to place the bar? Do you want your component to work at 60, 80, or 100C? Do you want it to work in high radiation environments? Do you want it to withstand pathological access patterns?

So in other words, there isn't a sufficient market for GPUs that cost double the $/GB for RAM but are resilient to rowhammer attacks to justify manufacturing them.

wnoise•6mo ago
The idea of pathological RAM access patterns is as ridiculous as the idea of pathological division of floating point numbers. ( https://en.wikipedia.org/wiki/Pentium_FDIV_bug ). The spec of RAM is to be able to store anything in any order, reliably. They failed the spec.
bobmcnamara•6mo ago
> The question is where do you choose to place the bar?

In the datasheet.

thijsr•6mo ago
Rowhammer is a problem inherent to the way we design DRAM. It is a known problem to memory manufacturers that is very hard, if not impossible, to fix. In fact, Rowhammer only becomes worse as memory density increases.
sroussey•6mo ago
It’s a matter of percentages… not all manufacturers fell to the rowhammer attack.

The positive part of the original rowhammer report was that it gave us a new tool to validate memory (it caused failures much faster than other validation methods).

privatelypublic•6mo ago
This seems predicated on there being significant workloads that split GPUs between tenants for compute purposes.

Anybody have sizable examples? Everything I can think of results in dedicated gpus.

im3w1l•6mo ago
The WebGPU API taking a screenshot of the full desktop, maybe?
Buttons840•6mo ago
Do you think WebGPU would be any more of an attack vector than WebGL?
privatelypublic•6mo ago
Rowhammer itself is a write-only attack vector. It can, however, potentially be chained to change the write address to an incorrect region. Haven't dived into details.
SnowflakeOnIce•6mo ago
How is it a write-only attack vector?
privatelypublic•6mo ago
Rowhammer allows you to corrupt/alter memory physically adjacent to memory you have access to. It doesn't let you read the memory you're attacking.

There are PoCs of corrupting memory _that the kernel uses to decide what that process can access_, but the process can't read that memory. It only knows that the kernel says yes where it used to say no. (Assuming it doesn't crash the whole machine first.)
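
For anyone wondering what the "hammering" itself looks like: the published CPU-side PoCs are essentially a tight loop that re-activates two aggressor rows while flushing them out of the cache. A minimal sketch (C, x86-only; the aggressor addresses are hypothetical placeholders, since a real PoC has to reverse-engineer the DRAM address mapping to find rows adjacent to the victim, and the GPU attack in the article uses different cache-bypassing tricks):

    #include <stdint.h>
    #include <emmintrin.h>   /* _mm_clflush */

    /* Sketch only: aggressor_a/aggressor_b stand in for two addresses in
     * DRAM rows adjacent to the victim row, in the same bank. */
    static void hammer(volatile uint8_t *aggressor_a,
                       volatile uint8_t *aggressor_b,
                       long iterations)
    {
        for (long i = 0; i < iterations; i++) {
            (void)*aggressor_a;                      /* activate row A */
            (void)*aggressor_b;                      /* activate row B */
            _mm_clflush((const void *)aggressor_a);  /* evict so the next read */
            _mm_clflush((const void *)aggressor_b);  /* hits DRAM, not cache   */
        }
        /* Afterwards the victim row in between is scanned for flipped bits. */
    }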

SnowflakeOnIce•6mo ago
Suppose you have access to certain memory. If you repeatedly read from that memory, can't you still corrupt/alter the physically adjacent memory you don't have access to? Does it really need to be a write operation you repeatedly perform?
privatelypublic•6mo ago
I probably should have called it "blind" instead.
extraduder_ire•6mo ago
> Does it really need to be a write operation you repeatedly perform?

Yes. The core of rowhammer attacks is in changing the values in RAM repeatedly, creating a magnetic field, which induces a change in the state of nearby cells of memory. Reading memory doesn't do that as far as I know.

vlovich123•6mo ago
Many of the GPU rental companies charge less for shared GPU workloads. So it's a cost/compute tradeoff. It's usually not about the workload itself needing the full GPU unless you really need all the RAM on a single instance.
privatelypublic•6mo ago
Any examples to check out? The only one I know of is Vast.ai... and there's already a list of security issues a mile long there.
diggan•6mo ago
I don't think Vast.ai does "shared GPUs", you can only rent full rigs, at least there is no indication the hardware is shared between multiple users at the same time.

But I think services like Runpod and similar lets you rent "1/6 of a GPU per hour" for example, which would be "shared hosting" basically, as there would be multiple users using the same hardware at the same time.

haiku2077•6mo ago
GKE can share a single GPU between multiple containers in a partitioned or timeshared scheme: https://cloud.google.com/kubernetes-engine/docs/concepts/tim...
privatelypublic•6mo ago
That's the thing... they're all the same tenant. A GKE node is a VM instance, and GCE doesn't have shared GPUs that I can see.
bluedino•6mo ago
NVIDIA GPUs can run in MIG (Multi-Instance GPU) mode, allowing you to pack on more jobs than you have GPUs. Very common in HPC, but I don't know about in the cloud.
privatelypublic•6mo ago
I thought about splitting the GPU between workloads, as well as terminal server/virtualized desktop situations.

I'd expect all code to be strongly controlled in the former, and reasonably secured in the latter, with software/driver-level mitigations possible and the fact that corrupting somebody else's desktop with rowhammer doesn't seem like a good investment.

As another person mentioned - and maybe it is a wider usage than I thought - cloud GPU compute running custom code seems to be the only useful item. But I'm having a hard time coming up with a useful scenario. Maybe corrupting a SIEM's analysis and alerting of an ongoing attack?

cyberax•6mo ago
No large cloud hoster (AWS, Google, Azure) shares GPUs between tenants.
shakna•6mo ago
Is that not what AWS is offering here? [0]

"In multi-tenant environments where the goal is to ensure strict isolation."

[0] https://aws.amazon.com/blogs/containers/gpu-sharing-on-amazo...

cyberax•6mo ago
This is for customers. AWS can use virtualization to slice their GPUs across multiple workloads (in their K8s), but AWS itself doesn't share GPUs.
SnowflakeOnIce•6mo ago
Example: A workstation or consumer GPU used both for rendering the desktop and running some GPGPU thing (like a deep neural network)
privatelypublic•6mo ago
Not an issue - that's a single tenant.

Which is my point.

spockz•6mo ago
Until the GPU is accessible by the browser and any website can execute code on it. Or the attack can come from a different piece of software on your machine.
privatelypublic•6mo ago
Update: I thought for a second I had one: Jupyter notebook services with GPUs - but looking at Google Colab*, even there it's a dedicated GPU for that session.

* Random aside: how is it legal for Colab compute credits to have a 90-day expiration? I thought California outlawed company currency expiring? (A la gift cards)

dogma1138•6mo ago
Colab credits aren’t likely a currency equivalent but a service equivalent which is still legal to expire afaik.

Basically, Google Colab credits are like buying a seasonal bus pass with X trips or a monthly parking pass with X hours, rather than getting store cash which can be used for anything.

huntaub•6mo ago
My (limited) understanding was that the industry previously knew that it was unsafe to share GPUs between tenants, which is why the major cloud providers only sell dedicated GPUs.
userbinator•6mo ago
No one really cared about the occasional bitflips in VRAM when GPUs were only used for rendering graphics. It's odd that enabling ECC can reduce performance, unless they mean that's only in the presence of ECC errors being corrected, since AFAIK for CPUs there isn't any difference in speed even when correcting errors.

> In a proof-of-concept, we use these bit flips to tamper with a victim’s DNN models and degrade model accuracy from 80% to 0.1%, using a single bit flip

There is a certain irony in doing this to probabilistic models, designed to mimic an inherently error-prone and imprecise reality.
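
As for how one flip does that much damage: one plausible intuition (my assumption, not necessarily the paper's exact mechanism) is that flipping the most significant exponent bit of a floating-point weight turns an ordinary value into an astronomically large one, which then swamps everything downstream. A minimal demo of just the bit flip:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        float weight = 0.37f;            /* a typical-looking FP32 model weight */
        uint32_t bits;
        memcpy(&bits, &weight, sizeof bits);

        bits ^= 1u << 30;                /* flip the MSB of the 8-bit exponent */

        float flipped;
        memcpy(&flipped, &bits, sizeof flipped);
        printf("before: %g  after: %g\n", weight, flipped);
        /* prints roughly: before: 0.37  after: 1.3e+38 */
        return 0;
    }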

hamandcheese•6mo ago
Available ECC DIMMs are often slower than non-ECC DIMMs. Both slower MT/s and higher latency. At least for "prosumer" ECC UDIMMs, which are what I'm familiar with.

So it doesn't seem that wild to me that turning on ECC might require running at lower bandwidth.

ryao•6mo ago
This is incorrect. ECC DIMMs are no slower than regular DIMMs. Instead, they have extra memory and extra memory bandwidth. An 8GB DDR4 ECC DIMM would have 9GB of memory and 9/8 the memory bandwidth. The extra memory is used to store the ECC bits, while the extra memory bandwidth prevents performance loss when reading/writing ECC alongside the rest of the memory. The memory controller will spend an extra cycle verifying the ECC, which is a negligible performance hit. In reality, there is no noticeable performance difference. However, where you would have 128 traces to a Zen 3 CPU for DDR4 without ECC, you would need 144 traces for DDR4 with ECC.

A similar situation occurs with GDDR6, except Nvidia was too cheap to implement the extra traces and pay for the extra chip, so instead, they emulate ECC using the existing memory and memory bandwidth, rather than adding more memory and memory bandwidth like CPU vendors do. This causes the performance hit when you turn on ECC on most Nvidia cards. The only exception should be the HBM cards, where the HBM includes ECC in the same way it is done on CPU memory, so there should be no real performance difference.
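
Spelled out, assuming the standard 72-bit-wide ECC DIMM layout:

    \[
      \underbrace{64}_{\text{data bits}} + \underbrace{8}_{\text{ECC bits}} = 72 \text{ bits per transfer},
      \qquad
      \tfrac{72}{64} = \tfrac{9}{8},
      \qquad
      8\,\mathrm{GB} \times \tfrac{9}{8} = 9\,\mathrm{GB}.
    \]

The extra eighth of capacity and bandwidth is consumed entirely by the ECC code words, which is why it never shows up in benchmarks; two 64-bit channels (128 data traces) likewise grow to two 72-bit channels (144 traces) with ECC.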

bilegeek•6mo ago
Their second point is wrong (unless the silicon is buggy), but their first point is true. I researched when buying ECC sticks for my rig; nobody that I've found makes unregistered sticks that go above 5600, while some non-ECC sticks are already at 8200, and 6400 is commonplace.

Frustratingly, it's only unregistered that's stuck in limbo; VCC makes a kit of registered 7200.

ryao•6mo ago
That is partly due to artificial reasons and partly due to technical reasons. The artificial reasons would be that the 8200 MT/sec UDIMMs are overclocked. Notice how they run much slower if you do not enable XMP/EXPO, which simultaneously over volts and overclocks them. These exist because a large number of people liked overclocking their memory modules to get better performance. This was unreliable and memory manufacturers noticed that there was a market for a premium product where the overclocking results were guaranteed. Early pre-overclocked modules required people to manually enter the manufacturer provided voltage, frequency and timings into the BIOS, but XMP and later EXPO were made to simplify this process. This idea only took off for non-ECC modules, since the market for ECC UDIMMs wants reliability above all else, so there never was quite the same market opportunity to sell ECC DIMMs that were guaranteed to overclock to a certain level outside of the memory IC maker’s specifications.

There is no technical reason why ECC UDIMMs cannot be overclocked to the same extent and ECC actually makes them better for overclocking since they can detect when overclocking is starting to cause problems. You might notice that the non-ECC UDIMMs have pads and traces for an additional IC that is present on ECC UDIMMs. This should be because the ECC DIMMs and non-ECC DIMMs are made out of the same things. They use the same PCBs and the same chips. The main differences would be whether the extra chips to store ECC are on the module, what the SPD says it is and what the sticker says. There might also be some minor differences in what resistors are populated. Getting back to the topic of overclocking, if you are willing to go back to the days before the premium pre-overclocked kits existed, you will likely find a number of ECC UDIMMs can and will overclock with similar parameters. There is just no guarantee of that.

As for RDIMMs having higher transfer rates, let us consider the differences between a UDIMM, a CUDIMM and a RDIMM. The UDIMM connects directly to the CPU memory controller for the clock, address, control and data signals, while the RDIMM has a register chip that buffers the clock, address and control signals, although the data signals still connect to the memory controller directly. This improves signal integrity and lets more memory ICs be attached to the memory controller. A recent development is the CUDIMM, which is a hybrid of the two. In the CUDIMM, the clock signal is buffered by a Client Clock Driver, which does exactly what the register chip does to the clock signal in RDIMMs. CUDIMM are able to reach higher transfer rates than UDIMMs without overclocking because of the Client Clock Driver, and since RDIMMs also do what CUDIMMs do, they similarly can reach higher transfer rates.

bilegeek•6mo ago
Thanks for the explanation on CUDIMM, I never quite grokked the difference besides it being more stable with two sticks per channel. Hopefully they'll make an ECC CUDIMM at some point, but I'm not holding my breath.
consp•6mo ago
If they don't, and you are up for a challenge in BGA soldering, you can make them yourself if there are pads for the chips. You'll likely have to buy an extra module to get the chips, though.
ryao•6mo ago
This would also need a SPD programmer and possibly some additional SMT resistors, but it is possible in theory.
bobmcnamara•6mo ago
Propagation delay is a thing.

Edit: at some point the memory controller gets a chunk from the lowest level write buffer and needs to compute ECC data before writing everything out to RAM.

Without ECC, that computation time isn't there. The ECC computation is done in parallel in hardware, but it's not free.

ryao•6mo ago
You assume that the ECC is not already calculated when the data is in the cache (and the cache line is marked dirty). Caches in CPUs are often ECC protected, regardless of whether the memory has ECC protection. The cache should already have the ECC computation done. Writes to ECC memory can simply reuse the existing ECC bytes from the cache, so no additional calculation time is needed on writes. Reads are where additional time is needed, in the form of one cycle, and if your cache is doing its job, you won’t notice this. If you do notice it, your cache hit rate is close to 0 and your CPU is effectively running around 50MHz due to pipeline stalls.

That said, this is tangential to whether the ECC DIMMs themselves run at lower MT/sec ratings with higher latencies, which was the original discussion. The ECC DIMM is simply memory. It has an extra IC and a wider data bus to accommodate that IC in parallel. The chips run at the same MT/sec as the non-ECC DIMM in parallel. The signals reach the CPU at the same time in both ECC DIMMs and non-ECC DIMMs, such that latencies are the same (the ECC verification does use an extra cycle in the CPU, but cache hides this). There are simply more data lanes with ECC DIMMs due to the additional parallelism. This means that there is more memory bandwidth in the ECC DIMM, but that additional memory bandwidth is being used by the ECC bytes, so you never see it in benchmarks.

bobmcnamara•6mo ago
> You assume that the ECC is not already calculated when the data is in the cache (and the cache line is marked dirty).

It was the case on the systems I worked with. Integrating it between the cache and memory controller is a great idea though, and it makes sense where you've described it.

> If you do notice it, your cache hit rate is close to 0 and your CPU is effectively running around 50MHz due to pipeline stalls.

Where memory latency hurts for us is ISRs and context switches. The hit rate is temporarily very low, and as you mentioned the IPC suffers greatly.

ryao•6mo ago
> Where memory latency hurts for us is ISRs and context switches. The hit rate is temporarily very low, and as you mentioned the IPC suffers greatly.

While that is true, that is infrequent and having those memory accesses take 151 cycles instead of 150 cycles is not going to make much difference. Note that those are ballpark figures.

bobmcnamara•6mo ago
For DDR4 it's 17 vs 16 memory cycles for the data burst phase.
ryao•6mo ago
If you measure memory access time in CPU cycles, you would see that 150 cycles is the ballpark. 16 cycles would be closer to L2 cache.
tverbeure•6mo ago
The kind of ECC that’s used for register file and memory protection is trivial to compute and completely in the noise in terms of area. It is essentially free.

The reason people say ECC is not free is because it added area for every storage location, not because of the ECC related logic.

bobmcnamara•6mo ago
> It is essentially free.

The cycle cost is often specified in the memory controller manual.

ryao•6mo ago
Nvidia implements ECC in software since they did not want to add the extra memory chip(s) needed to implement it in hardware to their boards. The only case where they do it in hardware is when they use HBM memory.

That said, GDDR7 does on die ECC, which gives immunity to this in its current form. There is no way to get information on corrected bitflips from on-die ECC, but it is better than nothing.

iFire•6mo ago
Does the ECC mode on my Nvidia RTX 4090 stop this?
fc417fc802•6mo ago
Yes, but it reduces performance, and you don't need to care about this because (presumably) you aren't a cloud provider running multi-tenant workloads.

Worst case scenario someone pulls this off using webgl and a website is able to corrupt your VRAM. They can't actually steal anything in that scenario (AFAIK) making it nothing more than a minor inconvenience.

perching_aix•6mo ago
Couldn't it possibly lead to arbitrary code execution on the GPU, with that opening the floodgates towards the rest of the system via DMA, or maybe even enabling the dropping of some payload for the kernel mode GPU driver?
fc417fc802•6mo ago
Only if an attacker can (1) identify a piece of exploitable data that the GPU firmware or kernel mode portion of the driver relies on, (2) ascertain its location at runtime, and (3) obtain a physically adjacent block of memory.

I'm not certain that something which satisfies 1, let alone 3, necessarily exists. On the CPU you flip some bits related to privilege levels. Are any analogous and similarly exploitable data structures maintained by common GPU firmware? And if so, is such data stored in the bulk VRAM?

It wouldn't surprise me if it was possible but it also seems entirely unrealistic given that either you are restricted to an API such as webgl (gl;hf) or you have native access in which case you have better options available to you seeing as the driver stacks are widely rumored to be security swiss cheese (if you doubt this look at how much effort google has invested in sandboxing the "real" GPU driver away from apps on chromeos).

bobmcnamara•6mo ago
(1) go for the root page table protection/ownership bits? These are often predictably located (but don't have to be). But getting access near them probably seems difficult.
keysdev•6mo ago
What about Apple M series?
PeterStuer•6mo ago
I always found hammering attacks to be extremely satisfying, even from a meta-physical pov.

You escape a closed virtual universe not by "breaking out" in the traditional sense, exploiting some bug in the VM hypervisor's boundary itself, but by directly manipulating the underlying physics of the universe on which the virtual universe is founded, just by creating a pattern inside the virtual universe itself.

No matter how many virtual digital layers, as long as you can impact the underlying analog substrate this might work.

Makes you dream there could be an equivalent for our own universe?

SandmanDP•6mo ago
> Makes you dream there could be an equivalent for our own universe?

I’ve always considered that to be what’s achieved by the LHC: smashing the fundamental building blocks of our universe together at extreme enough energies to briefly cause ripples through the substrate of said universe

whyowhy3484939•6mo ago
That's assuming there is a substrate that can be disturbed. That's where the parent's analogy breaks down.

As an example of an alternative analogy: think of how many bombs need to explode in your dreams before the "substrate" is "rippled". How big do the bombs need to be? How fast does the "matter" have to "move"? I think "reality" is more along those lines. If there is a substrate - and that's a big if - IMO it's more likely to be something pliable like "consciousness". Not in the least "disturbed" by anything moving in it.

rolisz•6mo ago
A nightmare that makes you wake up screaming? I'd say that counts as disturbing the substrate.
cwillu•6mo ago
It's a pretty exact description: the universe is made of fields, smashing stable excitations of those fields together produces disturbances in other fields (“virtual particles”) that sometimes makes (fleetingly) stable excitations in other fields, which then fall apart through the same dance into different stable excitations than we started with, allowing us to prove that the field in the middle exists and start to determine its properties.

https://profmattstrassler.com/articles-and-posts/particle-ph...

https://profmattstrassler.com/articles-and-posts/particle-ph...

jerf•6mo ago
Another way to think of it. Consider breaking out of Minecraft. Can you do it?

Maybe. There are certainly ways to crash it today. But now let's go through some cycles of fixing those crashes, and we'll run it on a system that can handle the resource usage even if it slows down in the external reality's terms quite a bit. And we'll ignore the slash commands and just stick to the world interactions you can make.

After that, can you forcefully break out of it from the inside?

No.

It is not obligatory for systems to include escape hatches. We're just not great at building complex systems without them. But there's no reason they are necessarily present in all systems.

Another brain bender covering the same idea in a different direction: The current reigning candidate for BB(6) runs an incomprehensible amount of computation [1]. Yet, did it at any point "break out" into our world? Nope. Nor do any of the higher ones. They're completely sealed in their mathematical world, which is fortunate since any of them would sweep aside our entire universe without noticing.

[1] https://scottaaronson.blog/?p=8972

BobaFloutist•6mo ago
I mean sometimes a dream upsets you enough that you wake up
simiones•6mo ago
The LHC doesn't generate anything like the kind of energy that you get when interstellar particles hit the Earth's upper atmosphere, nevermind what's happening inside the sun - and any of these are many, many orders of magnitude below the energies you get in a supernova, for example.

The LHC is extremely impressive from a human engineering perspective, but it's nowhere close to pushing the boundaries of what's going on every second in the universe at large.

robotnikman•6mo ago
The closest thing I can think of is a black hole.
drcongo•6mo ago
I love that we can switch out LHC for LSD and this comment would still feel perfect.
hoseja•6mo ago
Well for Earth life there are multiple but evolution just learned to exploit them all.
arduinomancer•6mo ago
I tried knocking on my wall 100,000 times and it did indeed cause a disturbance in the neighbouring cell of my apartment

Turns out this whole virtualized house abstraction is a sham

N_Lens•6mo ago
Thanks for the sensible chuckle.
lukan•6mo ago
"I always found hammering attacks to be extremely satisfying"

On a philosophical level I somewhat agree, but on a practical level I am sad as this likely means reduced performance again.

MangoToupe•6mo ago
Only for places where you need security. Many types of computation do not need security.
tuvang•6mo ago
In theory, true. But fixes to issues like this are usually done at the hardware level in future generations, or at a very low software level that most people don't have the knowledge or energy to deal with. The result is our editors/games/job tools running slower than they could, to mitigate security issues irrelevant to our common use cases.
lukan•6mo ago
Most devices are connected to the internet these days. Anything connected to the internet should be secure.
fc417fc802•6mo ago
If a task has been granted native GPU access then it's already on the inside of the security boundary. Conversely, if you don't trust a task then don't let it access the GPU (aside from passing the entire device through to a virtual machine). This attack doesn't change that reality.
mouse_•6mo ago
> Conversely, if you don't trust a task then don't let it access the (computer)

Wow you just solved all of cyber security

aspenmayer•6mo ago
If you liked that I have some art tips:

To draw an owl, start with two circles or ovals for eyes. Then all you have to do is draw the rest of the owl.

https://knowyourmeme.com/memes/how-to-draw-an-owl

immibis•6mo ago
This is not true. I don't know the details, but GPUs have something similar to page tables, so they can run untrusted tasks. The worst threat is that one could stick in an infinite loop, freezing your display output until it times out - since they don't get timesliced.
fc417fc802•6mo ago
What isn't true? I stand by what I said.

Having page tables (and other security features) isn't mutually exclusive with being horribly insecure in practice. CPUs have certainly had their fair share of vulnerabilities exposed within even just the past few years.

I'll freely admit that I'm going off of what other people have told me. I don't do GPU driver development (or other hardware or the kernel for that matter). But the message I've encountered has been consistent in this regard. If nothing else, ask yourself why google would go to the amount of trouble that they have to develop various GPU sandboxing layers for chromeos apps.

lukan•6mo ago
"The worst threat is that one could stick in an infinite loop, freezing your display output until it times out"

It is not my area of expertise, but since GPUs are increasingly used for calculating things, isn't the main threat rather data leakage or even manipulation of data?

WebGPU is designed to allow computation on the GPU.

fc417fc802•6mo ago
> isn't the main threat rather data leakage or even manipulation of data?

The (IMO fatally flawed) premise here is that the security boundaries enforced by the GPU hardware and driver stack would prevent that. Thus the worst case scenario is a DoS since GPUs somehow still don't seem to be very good at sharing hardware resources in scenarios that involve uncooperative parties.

Note that even without GPGPU workloads there's still the obvious exfiltration target of "framebuffer containing unlocked password manager".

sneak•6mo ago
If we are in a simulation, the number of ways that could be the equivalent of SysRq or control-alt-delete are infinite.

We haven’t even tried many of the simple/basic ones like moving objects at 0.9c.

0cf8612b2e1e•6mo ago
To ruin a great joke, particle accelerators do get up to 0.9999+ the speed of light.
sneak•6mo ago
Ahh, I forgot about those.
BobaFloutist•6mo ago
I will say, I wouldn't personally call individual protons "objects"
amelius•6mo ago
> No matter how many virtual digital layers

So you are saying that a GPU program can find exploits in physics without having access to e.g. high energy physics tools?

Sounds implausible.

mistercow•6mo ago
> Makes you dream there could be an equivalent for our own universe?

But would we even notice? As far as we were concerned, it would just be more physics.

LeifCarrotson•6mo ago
The real question is how we'd know about the physics of the parent universe.

I think this short story is interesting to think about in that way:

https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien...

queuebert•6mo ago
> Makes you dream there could be an equivalent for our own universe?

My idea to attack the simulation is psychological: make our own simulation that then makes its own simulation, and so on all the way down. That will sow doubt in the minds of the simulators that they themselves are a simulation and make them sympathetic to our plight.

nemomarx•6mo ago
What if the simulators tell you they're also doing this? It could be turtles all the way up perhaps
mrkstu•6mo ago
I remember as a kid in the 70’s I first heard about most physical things being empty space.

Walking into a wall a few hundred times may have damaged my forehead almost as much as my trust in science…

sylware•6mo ago
In the general case, that's why optimized, hand-written assembly machine code can be an issue compared to slower compiler-generated machine code (not true all the time, of course): if the machine code is 'hammering' memory, it is more likely to happen with the optimized assembly machine code than with the "actually tested" compiler-generated machine code.
saagarjha•6mo ago
No, in fact you can often Rowhammer inside an interpreter if you construct it correctly.
sylware•6mo ago
Point missed: this is not what I said.
saagarjha•6mo ago
It's slightly easier because native code is typically faster but I would not say that it would cause it to be the case where you can only do the attack by handwriting assembly. Rowhammer involves hitting memory locations which is generally easy to do from any context (most languages have arrays…). It's not like you need to do weird branch prediction stuff which might require specific gadgets to be present.
sylware•6mo ago
Fine-grained machine code is likely to build a more efficient/successful exploit.

But the goal, I guess, is to run that in one of the WHATWG cartel web engines (aka "sneaky" JavaScript), which are by themselves a security flaw already...