Muvera: Making multi-vector retrieval as fast as single-vector search

https://research.google/blog/muvera-making-multi-vector-retrieval-as-fast-as-single-vector-search/
79•georgehill•11h ago

Comments

trengrj•7h ago
We added Muvera to Weaviate recently https://weaviate.io/blog/muvera and also have a nice podcast on it https://www.youtube.com/watch?v=nSW5g1H4zoU.

When looking at multi-vector / ColBERT-style approaches, the embedding-per-token approach can massively increase costs. You might go from a single 768-dimension vector to 128 x 130 = 16,640 dimensions. Even with better results from a multi-vector model, this can make it infeasible for many use cases.
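To make the cost gap concrete, here is a rough back-of-the-envelope calculation. It assumes float32 storage (4 bytes per value) and reads "128 x 130" as 130 token vectors of 128 dimensions each; both are assumptions for illustration, not figures from the comment.

```python
# Rough storage per document, assuming float32 values (4 bytes each).
BYTES_PER_FLOAT32 = 4

def floats_per_doc(num_vectors: int, dims: int) -> int:
    """Total number of stored values for one document."""
    return num_vectors * dims

single = floats_per_doc(1, 768)     # one 768-dimension vector
multi = floats_per_doc(130, 128)    # assumed: 130 token vectors x 128 dimensions

print(single * BYTES_PER_FLOAT32)   # 3072 bytes
print(multi * BYTES_PER_FLOAT32)    # 66560 bytes
print(round(multi / single, 1))     # 21.7
```

Under those assumptions the multi-vector representation is roughly 22 times larger per document before any quantization.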

Muvera converts the multiple vectors into a single fixed-dimension (usually net smaller) vector that can be used by any ANN index. As you now have a single vector, you can use all your existing ANN algorithms and stack other quantization techniques for memory savings. In my opinion it is a much better approach than PLAID because it doesn't require specific index structures or clustering assumptions and can achieve lower latency.
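For context, the score that a single-vector encoding has to approximate in ColBERT-style retrieval is usually a sum of maxima over token pairs (often called MaxSim, or Chamfer similarity). The sketch below is a minimal pure-Python illustration with made-up toy vectors; the function names are hypothetical and not taken from any library.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_vecs, doc_vecs):
    """For each query vector, take its best inner product against any
    document vector, then sum those maxima over the query vectors."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy data: 2 query token vectors, 3 document token vectors, 2 dimensions.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

print(maxsim(query, doc))  # 3.0
```

Scoring one candidate this way takes one inner product per (query token, document token) pair, whereas a single-vector index needs only one inner product per candidate, which is why collapsing to one vector matters for latency.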

dinobones•5h ago
So this is basically an “embedding of embeddings”, an approximation of multiple embeddings compressed into one, to reduce dimensionality/increase performance.

All this tells me is that the “multiple embeddings” are probably mostly overlapping and the marginal value of each additional one is probably low, if you can represent them with a single embedding.

I don’t otherwise see how you can keep comparable performance without breaking information theory.

kevmo314•1h ago
> marginal value of each additional one is probably low

This is the point of the paper. Specifically, that single embedding vectors are sparse enough that you can compact more data from additional vectors together to improve retrieval performance.

bobosha•3h ago
How is this different from generating a feature hash of the embeddings, i.e. a many-to-one embedding reduction? Could UMAP or a similar technique help in reducing to a single vector?

dinkdonkbell•3h ago
UMAP doesn't project values into the same coordinate space. While the abstract properties are the same between projections, the coordinates a given point lands on won't be the same from one projection to the next.

nighthawk454•2h ago
Seems to be a trend away from mean-pooling into a single embedding. But instead of dealing with an embedding per token (lots), you still want to reduce it some. This method seems to cluster token embeddings by random partitioning, mean-pool each partition, and concatenate the results into a fixed-length final embedding.

Essentially, full multi-vector comparison is challenging performance-wise. Tools and performance for single vectors are much better. To compromise, cluster into k chunks and concatenate. Then you can do k-vector comparison at once with single-vector tooling and performance.

Ultimately the fixed-length vector comes from having a fixed number of partitions, so this is kind of just k-means-style clustering of the token-level embeddings.

Presumably a dynamic clustering of the tokens could be even better, though that would leave you with a variable number of embeddings per document.
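The partition, pool, and concatenate idea described above can be sketched in a few lines. This is a simplified illustration, not the method from the Muvera post: it assumes partitions are defined by the signs of random Gaussian hyperplanes, leaves empty partitions as zeros, and omits any further steps the real method applies. All function names are hypothetical.

```python
import random

def make_hyperplanes(num_planes, dim, seed=0):
    """Random Gaussian hyperplanes; k planes define 2**k partitions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def bucket_of(vec, planes):
    """Partition index: one bit per hyperplane (which side the vector is on)."""
    idx = 0
    for plane in planes:
        side = sum(p * v for p, v in zip(plane, vec)) > 0
        idx = (idx << 1) | (1 if side else 0)
    return idx

def fixed_length_encoding(token_vecs, planes):
    """Mean-pool the token vectors in each partition, then concatenate."""
    dim = len(planes[0])
    num_buckets = 2 ** len(planes)
    sums = [[0.0] * dim for _ in range(num_buckets)]
    counts = [0] * num_buckets
    for vec in token_vecs:
        b = bucket_of(vec, planes)
        counts[b] += 1
        for i, x in enumerate(vec):
            sums[b][i] += x
    out = []
    for b in range(num_buckets):
        n = counts[b] or 1  # empty partitions stay all-zero in this sketch
        out.extend(x / n for x in sums[b])
    return out

planes = make_hyperplanes(num_planes=3, dim=4)
rng = random.Random(1)
doc_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]    # 5 tokens
doc_b = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(12)]   # 12 tokens

enc_a = fixed_length_encoding(doc_a, planes)
enc_b = fixed_length_encoding(doc_b, planes)
print(len(enc_a), len(enc_b))  # 32 32
```

Both documents end up with 2**3 partitions x 4 dimensions = 32 values regardless of token count, which is what lets the result go into an ordinary single-vector index. Unlike k-means, the partitions here do not depend on the data, so query and document encodings share the same layout.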

Google DeepMind Releases AlphaGenome

https://deepmind.google/discover/blog/alphagenome-ai-for-better-understanding-the-genome/
278•i_love_limes•7h ago•79 comments

Launch HN: Issen (YC F24) – Personal AI language tutor

200•mariano54•7h ago•173 comments

Memory Safety Is Merely Table Stakes

https://www.usenix.org/publications/loginonline/memory-safety-merely-table-stakes
34•comradelion•2h ago•15 comments

Starcloud says 1 launch, $8M but ISS tech says 17 launches, $850M+

https://angadh.com/space-data-centers-1
19•angadh•1h ago•18 comments

Kea 3.0, our first LTS version

https://www.isc.org/blogs/kea-3-0/
9•conductor•1h ago•4 comments

A Review of Aerospike Nozzles: Current Trends in Aerospace Applications

https://www.mdpi.com/2226-4310/12/6/519
56•PaulHoule•6h ago•24 comments

Matrix v1.15

https://matrix.org/blog/2025/06/26/matrix-v1.15-release/
52•todsacerdoti•1h ago•10 comments

"Why is the Rust compiler so slow?"

https://sharnoff.io/blog/why-rust-compiler-slow
80•Bogdanp•2h ago•91 comments

Introducing Gemma 3n

https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide/
216•bundie•4h ago•94 comments

The time is right for a DOM templating API

https://justinfagnani.com/2025/06/26/the-time-is-right-for-a-dom-templating-api/
19•mdhb•2h ago•4 comments

Show HN: I built an AI dataset generator

https://github.com/metabase/dataset-generator
100•matthewhefferon•6h ago•21 comments

SigNoz (YC W21, Open Source Datadog) Is Hiring DevRel Engineers (Remote)(US)

https://www.ycombinator.com/companies/signoz/jobs/cPaxcxt-devrel-engineer-remote-us-time-zones
1•pranay01•2h ago

Snow - Classic Macintosh emulator

https://snowemu.com/
171•ColinWright•12h ago•63 comments

Shifts in diatom and dinoflagellate biomass in the North Atlantic over 6 decades

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323675
23•PaulHoule•4h ago•1 comment

Robots that learn

https://openai.com/index/robots-that-learn/
24•ulrischa•1h ago•8 comments

A new pyramid-like shape always lands the same side up

https://www.quantamagazine.org/a-new-pyramid-like-shape-always-lands-the-same-side-up-20250625/
598•robinhouston•1d ago•145 comments

Low Overhead Allocation Sampling in a Garbage Collected Virtual Machine

https://arxiv.org/abs/2506.16883
7•matt_d•3d ago•1 comment

Puerto Rico's Solar Microgrids Beat Blackout

https://spectrum.ieee.org/puerto-rico-solar-microgrids
324•ohjeez•21h ago•185 comments

Typr – TUI typing test with a word selection algorithm inspired by keybr

https://github.com/Sakura-sx/typr
26•Sakura-sx•3d ago•3 comments

Alternative Layout System

https://alternativelayoutsystem.com/scripts/#same-sizer
6•smartmic•2h ago•0 comments

Lateralized sleeping positions in domestic cats

https://www.cell.com/current-biology/fulltext/S0960-9822(25)00507-X?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS096098222500507X%3Fshowall%3Dtrue
62•EvgeniyZh•2h ago•25 comments

-2000 Lines of code (2004)

https://www.folklore.org/Negative_2000_Lines_Of_Code.html
512•xeonmc•1d ago•221 comments

The Business of Betting on Catastrophe

https://thereader.mitpress.mit.edu/the-business-of-betting-on-catastrophe/
52•anarbadalov•3d ago•23 comments

Show HN: Magnitude – open-source AI browser automation framework

https://github.com/magnitudedev/magnitude
19•anerli•3h ago•7 comments

US economy shrank 0.5% in the first quarter, worse than earlier estimates

https://apnews.com/article/economy-tariffs-trump-gdp-shrink-86d1f15e66c646ac4ce88ffc0a956942
187•Aloisius•2h ago•57 comments

Muvera: Making multi-vector retrieval as fast as single-vector search

https://research.google/blog/muvera-making-multi-vector-retrieval-as-fast-as-single-vector-search/
79•georgehill•11h ago•6 comments

FLUX.1 Kontext [Dev] – Open Weights for Image Editing

https://bfl.ai/announcements/flux-1-kontext-dev
107•minimaxir•6h ago•29 comments

Access BMC UART on Supermicro X11SSH

https://github.com/zarhus/zarhusbmc/discussions/3
50•pietrushnic•6h ago•8 comments

A.I. Is Homogenizing Our Thoughts

https://www.newyorker.com/culture/infinite-scroll/ai-is-homogenizing-our-thoughts
41•thoughtpeddler•44m ago•25 comments

Ambient Garden

https://ambient.garden
268•fipar•3d ago•51 comments