How much slower is random access, really?

https://samestep.com/blog/random-access/

42•sestep•3d ago

Comments

Adhyyan1252•4h ago

Love this analysis! Was expecting random to be much slower. 4x is not bad at all

andersa•3h ago

Note this is not true random access in the manner it occurs in most programs. By having a contiguous array of indices to look at, that array can be prefetched as it goes, and speculative execution will take care of loading many upcoming indices of the target array in parallel.

A more interesting example might be if each slot in the target array has the next index to go to in addition to the value, then you will introduce a dependency chain preventing this from happening.
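To make the distinction concrete, here is a minimal sketch of the two access patterns. The names `gather_sum`, `make_chain`, and `chase_sum` are hypothetical, not from the article. Python lists hold boxed integers, so this only shows the shape of the dependency and says nothing about real timings; a fair measurement would need a compiled language and flat arrays.

```python
import random

def gather_sum(values, indices):
    """Index-array pattern: every index is known up front,
    so the loads do not depend on one another."""
    total = 0
    for i in indices:
        total += values[i]
    return total

def make_chain(n, seed=0):
    """Build next_index so that following it from slot 0
    visits every slot exactly once before returning to 0."""
    rng = random.Random(seed)
    order = list(range(1, n))
    rng.shuffle(order)
    order = [0] + order
    next_index = [0] * n
    for pos, slot in enumerate(order):
        next_index[slot] = order[(pos + 1) % n]
    return next_index

def chase_sum(values, next_index, start=0):
    """Dependency-chain pattern: the next index is only known
    after the current slot has been loaded."""
    total = 0
    i = start
    for _ in range(len(values)):
        total += values[i]
        i = next_index[i]  # next address depends on this load
    return total
```

Both functions touch every element once and return the same sum; the difference is only that `chase_sum` cannot know where to go next until the previous load completes.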

jiggawatts•2h ago

This is why array random access and linked-list random access have wildly different performance characteristics.

Another thing I noticed is that the spike on the left hand side of his graphs is the overhead of file access.

Without this overhead, small array random access should have a lot better per-element cost.

wtallis•25m ago

> A more interesting example might be if each slot in the target array has the next index to go to in addition to the value, then you will introduce a dependency chain preventing this from happening.

However, on some processors there's a data-dependent prefetcher that will notice the pointer-like value and start prefetching that address before the CPU requests it.

forrestthewoods•3h ago

Here’s an older blog post of mine on roughly the same topic:

https://www.forrestthewoods.com/blog/memory-bandwidth-napkin...

I’m not sure I agree with the data presentation format. “time per element” doesn’t seem like the right metric.

klank•1h ago

What are your qualms with time per element? I liked it as a metric because it kept the total deviation of results to less than 32 across the entire result set.

Using something like the overall run length would produce such large variations that only the shape of the graph would be useful (to me), and the values themselves much less so.

If I were showing a chart like this to "leadership", I'd show the overall run length, since I'd care more about them realizing the "real world" impact than the per-unit impact. But this is written for engineers, so I'd expect a blog like this to focus on per-unit impact.

However, having said all that, I'd love to hear what your reservations are about using it as a metric.

alain94040•1h ago

From your blog post:

> Random access from the cache is remarkably quick. It's comparable to sequential RAM performance

That's actually expected once you think about it: it's a natural consequence of prefetching.

porcoda•2h ago

The RandomAccess (or GUPS) benchmark (see: https://ieeexplore.ieee.org/document/4100365) was designed to measure machines on this kind of workload. In high performance computing this was important for graph calculations, and it was one of the things the Cray (formerly Tera) MTA machine was particularly good at. I suppose this benchmark isn't very widely known outside HPC circles.
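For readers who haven't seen it, the core of a GUPS-style kernel is a stream of pseudo-random XOR updates into a large power-of-two table. The sketch below is written in that spirit only: the feedback constant `POLY`, the seed, and the function names are assumptions for illustration and are not taken from the linked paper.

```python
MASK64 = (1 << 64) - 1
POLY = 0x7  # assumed feedback constant for the shift-register generator

def next_random(ran):
    """Advance a 64-bit shift-register generator by one step."""
    high_bit = ran >> 63
    return ((ran << 1) & MASK64) ^ (POLY if high_bit else 0)

def run_updates(table, num_updates, seed=1):
    """Apply num_updates XOR updates at pseudo-random table slots."""
    size = len(table)
    if size == 0 or size & (size - 1) != 0:
        raise ValueError("table size must be a power of two")
    ran = seed
    for _ in range(num_updates):
        ran = next_random(ran)
        # The slot comes from the low bits of the random value,
        # so consecutive updates land all over the table.
        table[ran & (size - 1)] ^= ran
    return table
```

Because XOR is its own inverse, running the same update stream twice restores the original table, which gives a cheap correctness check for a kernel like this.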

FpUser•25m ago

I ran a different kind of experiment, one that measures the benefit of branch prediction, on an AMD 9950X with a contiguous array of 1,000,000 elements. It computes a sum, adding an element only if it is bigger than 125 (roughly 50% of 256). The difference between random and sorted input was 10x. I guess branch prediction plays a huge role as well.
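A minimal sketch of that kind of experiment, assuming byte-valued elements and the threshold of 125 mentioned above. The function names are hypothetical, and in Python the interpreter overhead hides most of the branch-prediction effect, so treat this as a description of the setup rather than a benchmark.

```python
import random

def make_data(n, seed=0):
    """n pseudo-random values in the range 0..255."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(n)]

def conditional_sum(data, threshold=125):
    """Sum only the elements above the threshold.
    The `if` is the branch whose predictability depends on
    whether the input is in random or sorted order."""
    total = 0
    for x in data:
        if x > threshold:
            total += x
    return total
```

The random and sorted arrays contain the same values, so `conditional_sum` returns the same result for both; any timing gap in a compiled version comes from how predictable the branch is, not from the amount of work.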

Andys•18m ago

Thanks for sharing that.

Presumably if you'd split the elements into 16 shares (one for each CPU), summed with 16 threads, and then summed the lot at the end, then random would be faster than sorted?
