During vector retrieval, we retrieve documents in sublinear time from a vector index. This lets us cut the candidate set from potentially billions of documents to a much smaller number. The purpose of re-ranking is to let high-powered models evaluate that smaller set of docs much more closely.
It is true that we can attempt to distill that reranking signal into a vector index. Most search engines already do this. But there is no replacement for using the high-powered, behavior-based models in reranking.
> "The real challenge in traditional vector search isn't just poor re-ranking; it's weak initial retrieval. If the first layer of results misses the right signals, no amount of re-sorting will fix it. That's where Superlinked changes the game."
Currently a lot of RAG pipelines use the BM25 algorithm for retrieval, which is very good. You then use an agent to rerank only after you've got your top 5-25 results, which is not that slow or expensive if you've done a good job with your chunking. Using metadata is also not really a 'new' approach (well, in LLM time at least) - it's more about which metadata you use and how you use it.
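For anyone who hasn't looked at what BM25 actually computes, here's a minimal sketch of the standard Okapi BM25 formula in plain Python. The function names (`bm25_scores`, `top_k`), the whitespace tokenizer, and the defaults `k1=1.5`, `b=0.75` are my own assumptions for illustration, not anything from the article or a particular library.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every doc against the query with Okapi BM25 (toy whitespace tokenizer)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many docs does each term appear?
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF, always positive.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            denom = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / denom
        scores.append(score)
    return scores


def top_k(query: str, docs: List[str], k: int = 2) -> List[int]:
    """Indices of the k best-scoring docs; these are what you'd hand to a reranker."""
    scores = bm25_scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]
```

The point is that this first stage is pure term counting, so it's cheap enough to run over everything, and only the handful of indices from `top_k` ever reach the expensive reranker.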
Treating BM25 as a silver bullet is just as strange as treating vector search as the "true way" to solve retrieval.
The reason we don’t run the most powerful models on thousands/millions of candidates is latency, not quality. It’s the same reason we use ANN search rather than computing exact cosine similarity against every doc in the index.
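To make the latency argument concrete, here's a rough sketch of the two-stage pattern. Everything here is hypothetical: `retrieve_then_rerank` and `expensive_score` are names I made up, and stage 1 uses brute-force cosine similarity purely as a stand-in for a real ANN index.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query_vec: Sequence[float],
    doc_vecs: List[Sequence[float]],
    expensive_score: Callable[[int], float],
    k: int = 3,
) -> List[int]:
    """Return doc indices: cheap retrieval of k candidates, then expensive rerank."""
    # Stage 1: cheap similarity over the whole corpus.
    # (Assumption: a real system would query an ANN index here instead of scanning.)
    candidates = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )[:k]
    # Stage 2: the expensive model is called once per candidate, so k times in total,
    # no matter how large the corpus is.
    return sorted(candidates, key=expensive_score, reverse=True)
```

The expensive scorer's cost scales with `k`, not with the corpus size, which is the whole reason the split exists. It also shows the limit the quoted passage is pointing at: a doc that stage 1 drops never gets seen by stage 2.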
At retrieval time, our approach involves a broad "prefetching" step: we quickly identify the most relevant schemas, perform targeted vector searches within these schemas, and then rerank the top results using the LLM before agentic reasoning and execution. The LLM is provided with carefully pre-selected tools and fields, empowering it to dive deeper into prefetched results or explore alternate queries dynamically. This method significantly boosts RAG pipeline performance, ensuring both speed and relevance.
Additionally, by limiting visibility of the "agentic execution context" to just the current operation span and collapsing it in subsequent interactions, we keep context sizes manageable, further enhancing responsiveness and scalability.
This statement ^ is clearly incorrect in its premise - semantic meaning is already vectorized, and the problems with that are old news and have little to do with indexing.
I went through the article though, and realized the company is probably on its last legs - an effort that was interesting 2 years ago for about a week, but funded by non-developers without any gauge of reality.
petesergeant•5h ago
I read as much of this article as I could be bothered to and still didn’t really understand how it removes the need for reranking. It starts talking about mixing vector and non-vector search, so ok fine. Is there any signal here or is it all marketing fluff?
dev_l1x_be•5h ago
They achieve this in a few different ways:
- Unified Multimodal Vectors (Mixing Data Types from the Start)
Instead of just creating a vector from the text description, Superlinked creates a single, richer vector for each item (e.g., a pair of headphones) right when it's indexed. This "multimodal vector" already encodes not just the text's meaning, but also its numerical attributes (like price, rating, battery life) and categorical attributes (like "electronics," "on-ear").
- Dynamic Query-Time Weighting (Telling the Search What Matters Now)
When you make a query, you can tell Superlinked how important each of those "baked-in" aspects of the multimodal vector is for that specific search. For example, for "Find affordable wireless headphones under $200 with high ratings", you can weight the "price" aspect heavily (to favor lower prices), the "rating" aspect heavily, and the "text similarity" to "wireless headphones" also significantly, all within the initial query to the unified vector.
- Hard Filtering Before Vector Search (Cutting Out Irrelevant Items Early)
You apply hard filters (like price <= 200 or category == "electronics") before the vector similarity search even happens, so the search only runs on the remaining items.
If these are implemented well, Superlinked could improve the quality of initial retrieval to a point where a separate re-ranking stage becomes less necessary.
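Here's a toy sketch of how those three ideas could fit together. To be clear, this is not Superlinked's API or implementation: `Item`, `item_segments`, `weighted_score`, `search`, the 2-d "text embeddings", and the `MAX_PRICE` normalisation are all assumptions I made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_PRICE = 500.0  # assumed upper bound, used only to normalise price into [0, 1]


@dataclass
class Item:
    name: str
    text_vec: List[float]  # toy text embedding (assumed unit length)
    price: float
    rating: float          # 0 to 5
    category: str


def item_segments(item: Item) -> Dict[str, List[float]]:
    """One 'multimodal vector' per item, kept as named segments for readability."""
    return {
        "text": item.text_vec,
        "price": [1.0 - min(item.price, MAX_PRICE) / MAX_PRICE],  # cheaper -> closer to 1
        "rating": [item.rating / 5.0],
    }


def weighted_score(query: Dict[str, List[float]], item: Dict[str, List[float]],
                   weights: Dict[str, float]) -> float:
    """Weighted sum of per-segment dot products.

    This equals one dot product over the concatenated vectors with each query
    segment scaled by its weight, so it can be served by a single vector search.
    """
    return sum(
        w * sum(q * x for q, x in zip(query[name], item[name]))
        for name, w in weights.items()
    )


def search(items: List[Item], query_text_vec: List[float], weights: Dict[str, float],
           max_price: float, category: str, k: int = 3) -> List[Item]:
    # Hard filters run first, so the similarity search only sees the survivors.
    candidates = [it for it in items if it.price <= max_price and it.category == category]
    # The query asks for "matching text, lowest price, highest rating".
    query = {"text": query_text_vec, "price": [1.0], "rating": [1.0]}
    ranked = sorted(
        candidates,
        key=lambda it: weighted_score(query, item_segments(it), weights),
        reverse=True,
    )
    return ranked[:k]
```

Changing `weights` at query time reorders results without re-indexing anything, which is the "dynamic weighting" claim. Whether that is good enough to skip a reranker depends on how well price, rating, and text relevance can really be traded off linearly like this.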
Does this answer your question?