Scaling HNSWs

https://antirez.com/news/156
81•cyndunlop•6h ago
https://en.wikipedia.org/wiki/Hierarchical_navigable_small_w...

Comments

softwaredoug•1h ago
At very high scale, graphs are used less, or there's a clustering layer on top of the graphs.

Graphs can be complex to build and rebalance. Graph-like data structures with a thing, then a pointer out to another thing, aren't that cache friendly.

Add to that, people almost always want to *filter* vector search results. And this is a huge blind spot for consumers and providers. It's where the ugly performance surprises come from. Filtered HNSW isn't straightforward, and requires you to just keep traversing the graph looking for results that satisfy your filter.

HNSW came out of a benchmark regime where we just indexed some vectors and tried only to maximize recall at a given query latency. It doesn't take into account the filtering / indexing almost everyone wants.

Turbopuffer, for example, doesn't use graphs at all, it uses SPFresh. And they recently got 200ms latency on 100B vectors.

https://turbopuffer.com/docs/vector

curl-up•52m ago
I'm facing the problem you describe daily. It's especially bad because it's very difficult for me to predict if the set of filters will reduce the dataset by ~1% (in which case following the original vector index is fine) or by 99.99% (in which case you just want to brute force the remaining vectors).
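
A minimal sketch of the manual decision described above: estimate how many vectors survive the filter, then pick a path. The names `estimate_selectivity` and `choose_strategy` and the `brute_force_max` cutoff are hypothetical, not taken from any particular vector database. Note that a small sample will often see zero matches for a 99.99% filter, so an exact count from a filter index is preferable when one is available.

```python
import random


def estimate_selectivity(ids, predicate, sample_size=1000, seed=0):
    """Fraction of ids passing the filter, estimated from a random sample.

    Falls back to an exact count when the id list is small enough.
    """
    rng = random.Random(seed)
    sample = ids if len(ids) <= sample_size else rng.sample(ids, sample_size)
    if not sample:
        return 0.0
    return sum(1 for i in sample if predicate(i)) / len(sample)


def choose_strategy(n_total, selectivity, brute_force_max=10_000):
    """Pick a search path from the expected number of surviving vectors.

    brute_force_max is an assumed tuning knob: below it, scanning the
    surviving vectors directly is taken to be cheaper than graph traversal.
    """
    expected_matches = n_total * selectivity
    if expected_matches <= brute_force_max:
        return "brute_force"
    return "index_then_filter"
```

With one million vectors, a filter that keeps half of them routes to the index, while one that keeps 0.01% routes to brute force.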

Tried a million different things, but haven't heard of Turbopuffer yet. Any references on how they perform with such additional filters?

inertiatic•24m ago
Lucene and ES implement a shortcut for filters that are restrictive enough. Since it's already optimized for figuring out if something falls into your filter set, you first determine the size of that. You traverse the HNSW normally, then if you have traversed more nodes than your filter set's cardinality, you just switch to brute forcing your filter set distance comparisons. So the worst case is roughly 2x your filter set's size in vector distance operations. Quite neat.
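
A rough sketch of that idea, not Lucene's actual code: it uses a single-layer neighbor graph as a stand-in for HNSW, and the names `filtered_search`, `brute_force`, and the `ef` parameter are assumptions for illustration. The traversal counts visited nodes and abandons the graph for an exact scan of the filter set once the count exceeds the filter set's cardinality.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force(vectors, allowed, query, k):
    """Exact top-k over only the ids that pass the filter."""
    return heapq.nsmallest(k, allowed, key=lambda i: dist(vectors[i], query))


def filtered_search(vectors, neighbors, entry, allowed, query, k, ef=16):
    """Best-first graph search that collects only ids in `allowed`.

    Returns (ids, used_fallback). The visit budget equals len(allowed),
    so the graph phase costs at most about as much as the exact scan.
    """
    budget = len(allowed)
    visited = {entry}
    candidates = [(dist(vectors[entry], query), entry)]  # min-heap by distance
    results = []  # max-heap via negated distance, filtered ids only
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest open candidate is worse than the worst result.
        if len(results) >= ef and d > -results[0][0]:
            break
        if node in allowed:
            heapq.heappush(results, (-d, node))
            if len(results) > ef:
                heapq.heappop(results)
        for nb in neighbors[node]:
            if nb not in visited:
                visited.add(nb)
                if len(visited) > budget:
                    # Traversed more nodes than the filter set holds: scan it.
                    return brute_force(vectors, allowed, query, k), True
                heapq.heappush(candidates, (dist(vectors[nb], query), nb))
    top = sorted((-neg_d, i) for neg_d, i in results)[:k]
    return [i for _, i in top], False
```

On a toy chain graph where only two far-away ids pass the filter, the budget is exhausted after a couple of hops and the search falls back to scanning those two ids directly.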
curl-up•20m ago
Oh that's nice! Any references on this shortcut? How do you activate that behavior? I was playing around with ES, but the only suggestion I found was to use `count` on filters before deciding (manually) which path to take.
inertiatic•15m ago
Here you go https://github.com/apache/lucene/pull/656 - no need to do anything from the user side to trigger it as far as I know.
spullara•24m ago
I think hybrid search with vector similarity and filtering has mostly been solved by Vespa, and not even recently.

https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-sear...

softwaredoug•19m ago
For sure. But it's "solved" differently by every vector database. You have to pay attention to how it's solved.
cfors•4m ago
Just curious: what is the state of the art for filtered vector search? I took a quick look at the SPFresh paper and didn't see it specifically address filtering.
simonw•1h ago
This is well worth reading in full. The section about threading is particularly interesting: most of Redis is single-threaded, but antirez decided to use threads for the HNSW implementation and explains why.
dizzant•4m ago
> many programmers are smart, and if instead of creating a magic system they have no access to, you show them the data structure, the tradeoffs, they can build more things, and model their use cases in specific ways. And your system will be simpler, too.

Basically my entire full-time job is spent prosecuting this argument. It is indeed true that many programmers are smart, but it is equally true that many programmers _are not_ smart, and those programmers have to contribute too. More hands are usually better than simpler systems, for reasons that have nothing to do with technical proficiency.

A Catalog of Side Effects

https://bernsteinbear.com/blog/compiler-effects/
20•speckx•45m ago•2 comments

Terminal Latency on Windows (2024)

https://chadaustin.me/2024/02/windows-terminal-latency/
59•bariumbitmap•2h ago•38 comments

Scaling HNSWs

https://antirez.com/news/156
83•cyndunlop•6h ago•11 comments

Cache-friendly, low-memory Lanczos algorithm in Rust

https://lukefleed.xyz/posts/cache-friendly-low-memory-lanczos/
74•lukefleed•3h ago•8 comments

We ran over 600 image generations to compare AI image models

https://latenitesoft.com/blog/evaluating-frontier-ai-image-generation-models/
50•kalleboo•3h ago•24 comments

Xortran - A PDP-11 Neural Network With Backpropagation in Fortran IV

https://github.com/dbrll/Xortran
5•rahen•19m ago•0 comments

A modern 35mm film scanner for home

https://www.soke.engineering/
7•QiuChuck•41m ago•2 comments

Pikaday: A friendly guide to front-end date pickers

https://pikaday.dbushell.com
45•mnemonet•5h ago•17 comments

Creating minimal music with code in any programming language

https://zserge.com/posts/etude-in-c/
20•etrvic•6d ago•2 comments

The history of Casio watches

https://www.casio.com/us/watches/50th/Heritage/1970s/
71•qainsights•2d ago•42 comments

Show HN: Cactoide – Federated RSVP Platform

https://cactoide.org/
38•orbanlevi•3h ago•15 comments

iPhone Pocket

https://www.apple.com/newsroom/2025/11/introducing-iphone-pocket-a-beautiful-way-to-wear-and-carr...
343•soheilpro•10h ago•902 comments

Weave (YC W25) is hiring a founding ML engineer

https://www.ycombinator.com/companies/weave-3/jobs/ZPyeXzM-founding-ml-engineer
1•adchurch•3h ago

FFmpeg to Google: Fund Us or Stop Sending Bugs

https://thenewstack.io/ffmpeg-to-google-fund-us-or-stop-sending-bugs/
223•CrankyBear•1h ago•149 comments

Show HN: Data Formulator – interactive AI agents for data analysis (Microsoft)

https://data-formulator.ai/
13•chenglong-hn•2h ago•6 comments

Firefox expands fingerprint protections

https://blog.mozilla.org/en/firefox/fingerprinting-protections/
178•ptrhvns•4h ago•104 comments

The AI Surveillance Dystopia: Spying, Data Trafficking, & Corruption

https://store.gamersnexus.net/ai-dystopia
7•Stevvo•43m ago•3 comments

How I fell in love with Erlang

https://boragonul.com/post/falling-in-love-with-erlang
326•asabil•1w ago•193 comments

The R47: A new physical RPN calculator

https://www.swissmicros.com/product/model-r47
126•dm319•4d ago•72 comments

Grebedoc – static site hosting for Git forges

https://grebedoc.dev
29•todsacerdoti•5h ago•4 comments

Drawing Text Isn't Simple: Benchmarking Console vs. Graphical Rendering

https://cv.co.hu/csabi/drawing-text-performance-graphical-vs-console.html
39•PaulHoule•5h ago•30 comments

Array Programming the Mandelbrot Set

https://jcmorrow.com/mandelbrot/
28•jcmorrow•4d ago•3 comments

Advent of Code on the Z-Machine

https://entropicthoughts.com/advent-of-code-on-z-machine
85•todsacerdoti•8h ago•17 comments

Why effort scales superlinearly with the perceived quality of creative work

https://markusstrasser.org/creative-work-landscapes.html
116•eatitraw•12h ago•96 comments

The 'Toy Story' You Remember

https://animationobsessive.substack.com/p/the-toy-story-you-remember
1060•ani_obsessive•17h ago•297 comments

The Perplexing Appeal of the Telepathy Tapes

https://asteriskmag.com/issues/12-books/paradigm-shifted-the-perplexing-appeal-of-the-telepathy-t...
48•surprisetalk•6h ago•46 comments

Show HN: Gametje – A casual online gaming platform

https://gametje.com
83•jmpavlec•5h ago•32 comments

DARPA and Texas Bet $1.4B on Unique Foundry -3D heterogeneous integration

https://spectrum.ieee.org/3d-heterogeneous-integration
62•pseudolus•8h ago•14 comments

Welcome, the entire land - "Hello, world!" in hieroglyphics (2009)

https://optional.is/required/2009/12/03/welcome-the-entire-land/
78•andrelaszlo•9h ago•29 comments

High speed X-ray video: jumping beans, wind-up toys and more

https://www.youtube.com/watch?v=xdpDd7dyU00
51•surprisetalk•4d ago•18 comments