I'm not convinced this methodology is better than linear probing (which can then be easily optimized into Robin Hood hashing).
The only line I see about linear probing is:
> Linear jumps (h, h+16, h+32...) caused 42% insert failure rate due to probe sequence overlap. Quadratic jumps spread groups across the table, ensuring all slots are reachable.
Which just seems entirely erroneous to me. How can linear probing fail? Just keep jumping until you find an open spot. As long as there is at least one open spot, you'll find it in O(n) time because you're just scanning the whole table.
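To make the point concrete, here's a minimal sketch of what I mean (toy table, hypothetical `insert` name, Fibonacci-style hash constant chosen arbitrarily): with a stride-1 probe sequence h, h+1, h+2, ... (mod n), an insert visits every slot in the worst case, so it can only fail when the table is literally full.

```rust
// Toy open-addressing table with stride-1 linear probing.
// Illustrative sketch only: as long as one slot is empty, the probe
// sequence scans every slot, so insert succeeds in at most O(n) steps.

fn hash(key: u64, n: usize) -> usize {
    // Any reasonable mixer works for the demo; constant is arbitrary.
    ((key.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 32) as usize) % n
}

fn insert(table: &mut [Option<u64>], key: u64) -> Option<usize> {
    let n = table.len();
    let h = hash(key, n);
    for i in 0..n {
        let slot = (h + i) % n; // linear probe: h, h+1, h+2, ...
        if table[slot].is_none() {
            table[slot] = Some(key);
            return Some(slot);
        }
    }
    None // only reachable when the table is completely full
}

fn main() {
    let mut table: Vec<Option<u64>> = vec![None; 8];
    // Fill 7 of 8 slots; every insert lands somewhere despite collisions.
    for k in 0..7 {
        assert!(insert(&mut table, k).is_some());
    }
    assert!(insert(&mut table, 99).is_some()); // last open slot
    assert!(insert(&mut table, 100).is_none()); // table now full
    println!("linear probing only fails on a full table");
}
```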
Linear probing has a clustering problem, sure. But modern CPUs have these things called L1 caches, and those clusters have great spatial locality, so scanning them is stupidly fast in practice.
If you have ten threads all probing at the same time, you can get a kind of priority inversion where the first writer takes the longest to insert: once it hits more than a couple of collisions, later writers that collide with it claim the open slots before it can scan them.
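A rough sketch of that race (hypothetical names; lock-free inserts via compare-and-swap, which is one common way such tables are made concurrent): a thread that loses a CAS just moves on to the next slot, so a later writer can claim a slot before an earlier writer reaches it.

```rust
// Illustrative lock-free linear-probing insert. Each thread claims a
// slot with compare_exchange; losing the CAS means another writer got
// there first and we keep probing.

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

const EMPTY: u64 = 0; // keys must be nonzero so EMPTY is unambiguous

fn insert(table: &[AtomicU64], key: u64) -> Option<usize> {
    let n = table.len();
    let h = (key as usize).wrapping_mul(0x9E37) % n;
    for i in 0..n {
        let slot = (h + i) % n;
        if table[slot]
            .compare_exchange(EMPTY, key, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            return Some(slot);
        }
    }
    None
}

fn main() {
    let table: Arc<Vec<AtomicU64>> =
        Arc::new((0..64).map(|_| AtomicU64::new(EMPTY)).collect());
    let handles: Vec<_> = (1..=10u64)
        .map(|t| {
            let table = Arc::clone(&table);
            // 10 threads racing, 5 distinct nonzero keys each.
            thread::spawn(move || {
                (0..5u64)
                    .filter(|&k| insert(&table, t * 100 + k + 1).is_some())
                    .count()
            })
        })
        .collect();
    let inserted: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(inserted, 50); // all 50 inserts land (table holds 64)
    println!("all {} concurrent inserts succeeded", inserted);
}
```

The contention itself is nondeterministic, but the structure shows why: every failed CAS lengthens the loser's probe walk.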
Better to use a few distributions of keys from production-like datasets, e.g., from ClickBench. Most of them will be Zipfian and also have different temporal locality.
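If you want Zipfian keys without pulling real production data, a sketch like this works (toy LCG instead of a proper RNG; all names and constants are illustrative, not from ClickBench or any benchmark suite): build the Zipf CDF over ranks and sample by inverse CDF.

```rust
// Generate Zipf-distributed key ranks for hash-table benchmarks.
// Inverse-CDF sampling over precomputed cumulative weights.

fn zipf_cdf(n: usize, s: f64) -> Vec<f64> {
    // Cumulative mass over ranks 1..=n with exponent s.
    let mut cdf = Vec::with_capacity(n);
    let mut total = 0.0;
    for k in 1..=n {
        total += 1.0 / (k as f64).powf(s);
        cdf.push(total);
    }
    for c in cdf.iter_mut() {
        *c /= total; // normalize so the last entry is exactly 1.0
    }
    cdf
}

fn sample(cdf: &[f64], u: f64) -> usize {
    // Smallest rank whose cumulative mass reaches u.
    cdf.partition_point(|&c| c < u)
}

fn main() {
    let cdf = zipf_cdf(1000, 1.1);
    let mut state: u64 = 42; // toy LCG, not a real RNG
    let mut counts = vec![0usize; 1000];
    for _ in 0..100_000 {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let u = (state >> 11) as f64 / (1u64 << 53) as f64;
        counts[sample(&cdf, u)] += 1;
    }
    // Heavy head: the top rank should dwarf the 100th rank.
    assert!(counts[0] > counts[99] * 10);
    println!("top key: {} hits, 100th key: {} hits", counts[0], counts[99]);
}
```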
almostgotcaught•1h ago
This would've been 39,000X better if written in mojo.
conradludgate•1h ago
https://github.com/rust-lang/hashbrown/blob/master/src/contr...
https://github.com/rust-lang/hashbrown/blob/6efda58a30fe712a...