frontpage.

Hashed sorting is typically faster than hash tables

https://reiner.org/hashed-sorting
104•Bogdanp•3d ago

Comments

scrubs•2d ago
Great article. Informative and well written.
tialaramex•2h ago
Though big enough to be worthwhile at scale, it's notable how relatively small the difference is between the out-of-the-box Rust unstable sort and the tuned radix sort.

Lukas Bergdoll and Orson Peters did some really important heavy lifting to get this stuff from "If you're an expert you could probably tune a general purpose sort to have these nice properties" to "Rust's general purpose sort has these properties out of the box" and that's going to make a real difference to a lot of software that would never get hand tuned.

nasretdinov•1h ago
I imagine it must make a bigger difference in languages where allocations are more costly, too. E.g. in Go, sorting to count uniques is most certainly much faster than using a map, and it saves memory too.
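
(For concreteness, a minimal sketch of the two approaches being compared, in Rust rather than Go since that's the article's setting; names and data are illustrative:)

    use std::collections::HashSet;

    // Count distinct values by sorting: one big allocation (the copy),
    // then a cache-friendly linear scan.
    fn count_uniques_sorted(values: &[u64]) -> usize {
        let mut v = values.to_vec();
        v.sort_unstable();
        // Count positions where a value differs from its predecessor.
        v.windows(2).filter(|w| w[0] != w[1]).count() + usize::from(!v.is_empty())
    }

    // Count distinct values with a hash table: no sort, but every insert
    // is a random-access probe that can miss cache once the table
    // outgrows it.
    fn count_uniques_hashed(values: &[u64]) -> usize {
        let set: HashSet<u64> = values.iter().copied().collect();
        set.len()
    }

    fn main() {
        let data = vec![3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
        assert_eq!(count_uniques_sorted(&data), count_uniques_hashed(&data));
        println!("{} unique values", count_uniques_hashed(&data));
    }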
aDyslecticCrow•1h ago
This doesn't concern in-language allocations at all. There is only one large allocation for both approaches, done up-front.

The slowdown has to do with cache misses and CPU cache capacity, i.e. the optimisations the CPU applies while executing.

Granted, a language like Go may have more of the CPU cache used up by various runtime checks.

Basically, I think this analysis is largely language agnostic.

markisus•1h ago
Interesting article. It’s actually very strange that the dataset needs to be “big” for the O(n log n) algorithm to beat the O(n). Usually you’d expect the big O analysis to be “wrong” for small datasets.

I expect that in this case, like in all cases, as the datasets become galactically large, the O(n) algorithm will start winning again.

Anon_troll•1h ago
The hash-based algorithm is only O(n) because the entry size has a limit. In a more general case, it would be something more like O(m(n * e)), where n is the number of entries, e is the maximum entry size, and m is a function describing how caching and other details affect the computation. With small enough data, the hash is very fast due to CPU caches: even if it takes more steps, each step takes less time. The article explains this topic in a less handwavy manner.

Also, memory access is constant time only to some upper limit allowed by the hardware, which requires significant changes to the implementation when the data does not fit the system memory. So, the hash algorithm will not stay O(n) once you go past the available memory.

The sorting algorithms do not suffer from these complexities quite as much, and similar approaches can be used with data sets that do not fit a single system's memory. The sorting-based algorithms will likely win in the galactically large cases.

Edit: Also, once the hash table would need to grow beyond what the hash function can describe (e.g. beyond 64-bit integers), you need to grow the function's data type. This is essentially a hidden log(n) factor, as the required width of the data type grows as log(n) of the maximum data size.

shiandow•40m ago
Interestingly, you need a hash big enough to be unique for all data points with high probability; it doesn't take much to show that this requires at least O(log(n)) bits if all items are unique.

Also, if items take up k bytes, then the hash must typically be O(k), and both hashing and radix sort are O(n k).

Really, radix sort should be considered O(N), where N is the total amount of data in bytes. It can beat the theoretical limit because it sorts lexicographically, which is not always an option.
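
(A back-of-envelope version of that point, assuming a uniform random hash of width w bits over n distinct items; the birthday bound gives

    \mathbb{E}[\text{collisions}] \approx \binom{n}{2}\, 2^{-w} \ll 1
    \quad\Longrightarrow\quad
    w \gtrsim 2 \log_2 n ,

so each hash needs Ω(log n) bits, and for k-byte items both hashing and radix sort touch every byte, i.e. Θ(n k) work in total.)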

aDyslecticCrow•1h ago
That is what baffles me. The difference in big-O complexity should be more visible with size, but that's where it loses to the "worse" algorithm.

I could imagine the hash table winning again beyond an even greater threshold. Like, what about 120 GB and beyond?

shoo•45m ago
The analysis in the "Why does sorting win?" section of this article gave us a way to estimate that threshold.

Here's my attempt at it, based on that analysis:

Suppose each item key requires s bytes

For the hash table, assuming s <= 64 (our cache line size), we need to read one cache line and write one cache line, i.e. 2 * 64 bytes of bandwidth per item.

The bandwidth to sort one key is p(N) * 2 * s, where p(N) is the number of passes of 1024-bucket radix sort required to sort N elements, and the 2 comes from 1 read + 1 write per radix sort pass.

p(N) = ceil(log2(N) / log2(buckets))

Suppose N is the max number of items we can distinguish with an s=8 byte item key, so N = 2^64

then p(N) = ceil(64 / 10) = 7 passes

7 radix passes * (1 read + 1 write) * 8 bytes per item key = 112 bytes of bandwidth consumed; this is still less than the hash table's 2 * 64 = 128 bytes.

We haven't found the threshold yet.

We need to either increase the item count N or increase the item key size s beyond 8 bytes to find a threshold where this analysis estimates that the hash map uses less bandwidth. But we can't increase the item count N without first increasing the key size s. Suppose we just increase the key size and leave N as-is.

Assuming N = 2^64 items and an item key size of s = 10 bytes gives an estimate of 140 bytes of bandwidth for sorting vs 128 bytes for the hash table. We expect sorting to be slower for these values, and increasing s or N further should make it even worse.

(The bandwidth required for the hash map hasn't increased, as our 10-byte key is still smaller than a cache line.)

Edit: the above is wrong, as I was mixing up elements and items. Elements are not required to be unique items; if every element were a unique item, the problem would be trivial. So N is not bounded by the key size, and we can increase N beyond 2^64 without increasing the key size s beyond 8 bytes.

Keeping the key size s fixed at 8 bytes per item suggests the threshold is at N = 2^80 items, given by solving for N where the bandwidth estimate for sorting equals the bandwidth estimate for the hash table:

2 * 8 * (log2(N) / log2(1024)) = 2 * 64
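
(The arithmetic above as a tiny runnable sketch; the 64-byte cache line, 1024 radix buckets, and 1 read + 1 write per pass are the assumptions of this comment's model, not measurements:)

    // Back-of-envelope bandwidth model from the comment above.
    // Sort: p(N) passes of 1024-bucket radix, 1 read + 1 write of s bytes per pass.
    // Hash: 1 cache line read + 1 written per item (assuming s <= 64).
    fn sort_bytes_per_item(n_items: f64, key_bytes: f64) -> f64 {
        let passes = (n_items.log2() / 1024f64.log2()).ceil();
        passes * 2.0 * key_bytes
    }

    fn hash_bytes_per_item() -> f64 {
        2.0 * 64.0
    }

    fn main() {
        // With 8-byte keys the model crosses over at log2(N) = 80, i.e. N = 2^80:
        // 2 * 8 * ceil(log2(N) / 10) = 2 * 64  =>  log2(N) = 80
        for exp in [32u32, 64, 80, 96] {
            let n = (exp as f64).exp2();
            println!(
                "N = 2^{exp}: sort ~{} B/item, hash ~{} B/item",
                sort_bytes_per_item(n, 8.0),
                hash_bytes_per_item()
            );
        }
    }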

Ar-Curunir•27m ago
Why is it strange? The case where asymptotically better = concretely worse is usually the exception, not the norm.
simiones•5m ago
We expect asymptotically better algorithms to be, well, asymptotically better. Even if they lose out for small N, they should win for larger N - and we typically expect this "larger N" to not be that large - just big enough so that data doesn't fit in any cache or something like that.

However, in this particular case, the higher-complexity sorting algorithm gets better than the hash algorithm as N gets larger, even up to pretty large values of N. This is the counter-intuitive part. No one is surprised if an O(n log n) algorithm beats an O(n) algorithm for n = 10. But if the O(n log n) algorithm wins for n = 10 000 000, that's quite surprising.

smallpipe•1h ago
Could you reduce the BW requirement of the hash table by using an atomic increment instruction instead of a read-modify-write? It still performs a read-modify-write, but further down in the hierarchy, where more parallelism is available.
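
(Something like the following sketch; whether the hardware actually pushes the increment further down the memory hierarchy, rather than pulling the line into L1 anyway, depends on the CPU:)

    use std::sync::atomic::{AtomicU32, Ordering};

    // Keep the counters as atomics and issue a single fetch_add instead of an
    // explicit load-modify-store; the hope is that the core can keep more of
    // these read-modify-writes in flight instead of stalling on each load.
    fn count_into_buckets(bucket_ids: &[usize], counts: &[AtomicU32]) {
        for &b in bucket_ids {
            // Relaxed ordering suffices: only the final totals matter.
            counts[b].fetch_add(1, Ordering::Relaxed);
        }
    }

    fn main() {
        let counts: Vec<AtomicU32> = (0..16).map(|_| AtomicU32::new(0)).collect();
        count_into_buckets(&[3, 3, 7, 1, 3], &counts);
        assert_eq!(counts[3].load(Ordering::Relaxed), 3);
    }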
tgv•3m ago
Isn't radix sorting somewhat like hashing?

Reshaped is now open source

https://reshaped.so/blog/reshaped-oss
75•michaelmior•2h ago•11 comments

DeepCodeBench: Real-World Codebase Understanding by Q&A Benchmarking

https://www.qodo.ai/blog/deepcodebench-real-world-codebase-understanding-by-qa-benchmarking/
30•blazercohen•2h ago•2 comments

KDE launches its own distribution

https://lwn.net/SubscriberLink/1037166/caa6979c16a99c9e/
547•Bogdanp•14h ago•356 comments

Germany is not supporting ChatControl – blocking minority secured

https://digitalcourage.social/@echo_pbreyer/115184350819592476
651•xyzal•3h ago•173 comments

Show HN: Term.everything – Run any GUI app in the terminal

https://github.com/mmulet/term.everything
938•mmulet•1d ago•128 comments

DOOMscrolling: The Game

https://ironicsans.ghost.io/doomscrolling-the-game/
328•jfil•13h ago•77 comments

Piramidal (YC W24) Is Hiring Back End Engineer

https://www.ycombinator.com/companies/piramidal/jobs/1HvdaXs-full-stack-engineer-platform
1•dsacellarius•26m ago

Removing yellow stains from fabric with blue light

https://phys.org/news/2025-09-yellow-fabric-blue.html
58•bookofjoe•3d ago•39 comments

PgEdge Goes Open Source

https://www.pgedge.com/blog/pgedge-goes-open-source
17•Bogdanp•4h ago•2 comments

ChatGPT Developer Mode: Full MCP client access

https://platform.openai.com/docs/guides/developer-mode
467•meetpateltech•20h ago•253 comments

C++20 Modules: Practical Insights, Status and TODOs

https://chuanqixu9.github.io/c++/2025/08/14/C++20-Modules.en.html
9•ashvardanian•3d ago•2 comments

Hashed sorting is typically faster than hash tables

https://reiner.org/hashed-sorting
104•Bogdanp•3d ago•13 comments

Where did the Smurfs get their hats (2018)

https://www.pipelinecomics.com/beginning-bd-smurfs-hats-origin/
91•andsoitis•11h ago•35 comments

Court rejects Verizon claim that selling location data without consent is legal

https://arstechnica.com/tech-policy/2025/09/court-rejects-verizon-claim-that-selling-location-dat...
467•nobody9999•10h ago•53 comments

Pure and Impure Software Engineering

https://www.seangoedecke.com/pure-and-impure-engineering/
31•colonCapitalDee•3d ago•13 comments

How the tz database works (2020)

https://yatsushi.com/blog/tz-database/
23•jumbosushi•3d ago•3 comments

A desktop environment without graphics (tmux-like)

https://github.com/Julien-cpsn/desktop-tui
111•mustaphah•3d ago•33 comments

The HackberryPi CM5 handheld computer

https://github.com/ZitaoTech/HackberryPiCM5
216•kristianpaul•2d ago•75 comments

Jiratui – A Textual UI for interacting with Atlassian Jira from your shell

https://jiratui.sh/
258•gjvc•21h ago•66 comments

Intel's E2200 "Mount Morgan" IPU at Hot Chips 2025

https://chipsandcheese.com/p/intels-e2200-mount-morgan-ipu-at
76•ingve•14h ago•29 comments

Defeating Nondeterminism in LLM Inference

https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
274•jxmorris12•19h ago•114 comments

Rewriting Dataframes for MicroHaskell

https://mchav.github.io/rewriting-dataframes-for-microhs/
45•internet_points•3d ago•3 comments

Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts

88•davidgu•20h ago•44 comments

AI's $344B 'Language Model' Bet Looks Fragile

https://www.bloomberg.com/opinion/articles/2025-09-11/ai-s-344-billion-language-model-bet-looks-f...
4•thm•37m ago•1 comment

Brussels faces privacy crossroads over encryption backdoors

https://www.theregister.com/2025/09/11/eu_chat_control/
5•jjgreen•38m ago•0 comments

“No Tax on Tips” Includes Digital Creators, Too

https://www.hollywoodreporter.com/business/business-news/no-tax-on-tips-guidance-creators-trump-t...
146•aspenmayer•20h ago•242 comments

Pontevedra, Spain declares its entire urban area a "reduced traffic zone"

https://www.greeneuropeanjournal.eu/made-for-people-not-cars-reclaiming-european-cities/
826•robtherobber•1d ago•900 comments

Hot Chips 2025: Session 1 – CPUs

https://chipsandcheese.com/p/hot-chips-2025-session-1-cpus
27•rbanffy•3d ago•1 comment

Clojure's Solutions to the Expression Problem

https://www.infoq.com/presentations/Clojure-Expression-Problem/
138•adityaathalye•3d ago•15 comments

Fraudulent Publishing in the Mathematical Sciences

https://arxiv.org/abs/2509.07257
73•bikenaga•15h ago•37 comments