The theoretical improvement of DMMSY over Dijkstra is O(log^{2/3} n) vs O(log n). For n = 1M, that's a ratio of ~2.7x. To get even a 10x improvement from asymptotics alone, you'd need n ≈ 2^{1000}, far beyond any graph that fits in memory (or in the observable universe, for that matter). The ~800ns figure for 1M nodes also seems physically implausible. Likely the benchmark is comparing an optimized hot path against a naive Dijkstra, or measuring something other than what it claims.
Secondarily, I suspect that in the case of the "opt" version, the compiler might be optimising out the entire algorithm because it has no observable side effects.
Presumably the claimed contribution was the optimized version, but note that whatever DMMSY(res) is, it should still be O(m log^{2/3} n) or whatever DMMSY's time complexity is supposed to be.
But DMMSY(res)'s run times follow Dijkstra closely in the graph.
The only conclusion is that something is off: either the author measured the wrong thing, or they optimized the implementation to the point that the constant-factor gains overpower any asymptotic differences. Or the implementation is wrong.
At any rate, as you mentioned, the difference between `m log n` and `m log^{2/3} n` is insignificant at n=1M (and TBH won't be significant at any practical n). If the results and the implementation are both correct, it just means the author found a way to reduce the constant factor of the implementation.
https://en.wikipedia.org/wiki/General_number_field_sieve
It’s such an out-of-nowhere part of complexity statements (what’s special about ‘x/3’?!) that I have to wonder if there’s some underlying relation.
Integer factorisation and discrete logarithm solving do look a lot like a type of search.
I feel it’s possibly similar to monstrous moonshine, where different fields kept producing the same (or off-by-one from the same) out-of-nowhere number, and for the longest time no one saw the link between them.
Interesting - I hadn’t heard that term. Wikipedia link for anyone curious: https://en.wikipedia.org/wiki/Monstrous_moonshine
The term was coined by John Conway of Conway’s Game of Life.
- You did this with AI. I'm not going to dock points for this, it's $current_year and it's the normal thing to do, but it would be nice to explain your process a little bit? To put it bluntly, why does this improve upon typing "hey claude, read this paper and go nuts" into the proverbial text box?
- As others are pointing out, the 20,000x figure is quite something. For "normal sized" graphs we'd expect this algorithm to be actually slower, why aren't we seeing that? Could you explain how you tested your code?
- Why aren't you benchmarking against other established libraries? It seems like the Dijkstra implementation you claim to be improving upon is also your own.
Didn't have time to read the code more in depth.
The readme is very obviously Claude-written (or a similar model - certainly not GPT), if you check enough vibecoded projects you'll easily spot those readmes.
The style of the HTML page, as noted by others.
Useless comments in the source code, which humans also do, but LLMs do more often:
// Basic random double
static inline double rand_double() { return (double)rand() / (double)RAND_MAX; }
gcc -O3 -march=native -flto -DNDEBUG src/*.c -I include -o dmmsy_bench -lm
./dmmsy_bench
===========================================================================
n m Dijkstra DMMSY Res DMMSY Opt Spd(Opt)
---------------------------------------------------------------------------
1000 5000 0.0153 0.0171 0.0172 0.89 x
5000 25000 0.0006 0.0001 0.0000 16.26 x
10000 50000 1.0649 1.0729 1.0791 0.99 x
25000 125000 3.1196 3.1592 3.1291 1.00 x
50000 250000 6.9091 7.0232 7.0154 0.98 x
75000 375000 10.7737 10.9603 10.9621 0.98 x
100000 500000 14.9982 14.8546 14.8742 1.01 x
150000 750000 23.3425 23.8628 23.8416 0.98 x
200000 1000000 33.0851 33.8681 34.0710 0.97 x
250000 1250000 43.5518 45.6267 46.4157 0.94 x
350000 1750000 73.1518 75.8528 74.6965 0.98 x
500000 2500000 125.9957 130.0651 129.2500 0.97 x

(I think it's unlikely that it can be straight-up optimized out, because it dirties the thread-local workspace, and compilers generally don't optimize out writes to thread-local variables.)
Apparently all the LLMs get their design expertise and color palettes from the same place.
First, using a 4-ary heap means the performance is closer to O((E+V) log V) instead of the O(E + V log V) you can get using a Fibonacci heap. So it hides the drawback of the new method, which is that the dominant term goes from V log V to E log^{2/3} V.
If you look at that term, the new method is only asymptotically faster when log^{1/3}(n) > m/n (since m log^{2/3} n < m + n log n roughly reduces to m/n < log^{1/3} n), IF you're comparing to the actual O(E + V log V) version of Dijkstra's...
For m/n = 3, that cutover is at about 140M nodes.
Second, the "opt" seems to be applied only to the new method. There's no apples to apples comparison here.
danalec•2h ago
*Key implementation details:*

- Cache-optimized CSR layout for maximum spatial locality

- Zero-allocation design with pre-allocated workspaces

- Recursive subproblem decomposition instead of global priority queues
*Benchmarks:* On graphs with 250k–1M nodes, this shows 20,000x+ speedups over standard Dijkstra implementations. The DMMSY core runs in ~800ns for 1M nodes.
*Paper:* https://arxiv.org/pdf/2504.17033

*GitHub:* https://github.com/danalec/DMMSY-SSSP
This is experimental—focused on correctness first. Would love feedback on edge cases, better decomposition strategies, or cache-oblivious optimizations I might have missed.
crdrost•2h ago
reactordev•1h ago
This will have downstream impacts for sure. Keep doing stuff like this.