
Micro-Front Ends in 2026: Architecture Win or Enterprise Tax?

https://iocombats.com/blogs/micro-frontends-in-2026
1•ghazikhan205•1m ago•0 comments

Japanese rice is the most expensive in the world

https://www.cnn.com/2026/02/07/travel/this-is-the-worlds-most-expensive-rice-but-what-does-it-tas...
1•mooreds•1m ago•0 comments

These White-Collar Workers Actually Made the Switch to a Trade

https://www.wsj.com/lifestyle/careers/white-collar-mid-career-trades-caca4b5f
1•impish9208•1m ago•1 comments

The Wonder Drug That's Plaguing Sports

https://www.nytimes.com/2026/02/02/us/ostarine-olympics-doping.html
1•mooreds•1m ago•0 comments

Show HN: Which chef knife steels are good? Data from 540 Reddit threads

https://new.knife.day/blog/reddit-steel-sentiment-analysis
1•p-s-v•1m ago•0 comments

Federated Credential Management (FedCM)

https://ciamweekly.substack.com/p/federated-credential-management-fedcm
1•mooreds•2m ago•0 comments

Token-to-Credit Conversion: Avoiding Floating-Point Errors in AI Billing Systems

https://app.writtte.com/read/kZ8Kj6R
1•lasgawe•2m ago•1 comments

The Story of Heroku (2022)

https://leerob.com/heroku
1•tosh•2m ago•0 comments

Obey the Testing Goat

https://www.obeythetestinggoat.com/
1•mkl95•3m ago•0 comments

Claude Opus 4.6 extends LLM pareto frontier

https://michaelshi.me/pareto/
1•mikeshi42•4m ago•0 comments

Brute Force Colors (2022)

https://arnaud-carre.github.io/2022-12-30-amiga-ham/
1•erickhill•7m ago•0 comments

Google Translate apparently vulnerable to prompt injection

https://www.lesswrong.com/posts/tAh2keDNEEHMXvLvz/prompt-injection-in-google-translate-reveals-ba...
1•julkali•7m ago•0 comments

(Bsky thread) "This turns the maintainer into an unwitting vibe coder"

https://bsky.app/profile/fullmoon.id/post/3meadfaulhk2s
1•todsacerdoti•8m ago•0 comments

Software development is undergoing a Renaissance in front of our eyes

https://twitter.com/gdb/status/2019566641491963946
1•tosh•8m ago•0 comments

Can you beat ensloppification? I made a quiz for Wikipedia's Signs of AI Writing

https://tryward.app/aiquiz
1•bennydog224•9m ago•1 comments

Spec-Driven Design with Kiro: Lessons from Seddle

https://medium.com/@dustin_44710/spec-driven-design-with-kiro-lessons-from-seddle-9320ef18a61f
1•nslog•9m ago•0 comments

Agents need good developer experience too

https://modal.com/blog/agents-devex
1•birdculture•11m ago•0 comments

The Dark Factory

https://twitter.com/i/status/2020161285376082326
1•Ozzie_osman•11m ago•0 comments

Free data transfer out to internet when moving out of AWS (2024)

https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-internet-when-moving-out-of-aws/
1•tosh•12m ago•0 comments

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•alwillis•13m ago•0 comments

Prejudice Against Leprosy

https://text.npr.org/g-s1-108321
1•hi41•14m ago•0 comments

Slint: Cross Platform UI Library

https://slint.dev/
1•Palmik•18m ago•0 comments

AI and Education: Generative AI and the Future of Critical Thinking

https://www.youtube.com/watch?v=k7PvscqGD24
1•nyc111•18m ago•0 comments

Maple Mono: Smooth your coding flow

https://font.subf.dev/en/
1•signa11•19m ago•0 comments

Moltbook isn't real but it can still hurt you

https://12gramsofcarbon.com/p/tech-things-moltbook-isnt-real-but
1•theahura•23m ago•0 comments

Take Back the Em Dash–and Your Voice

https://spin.atomicobject.com/take-back-em-dash/
1•ingve•23m ago•0 comments

Show HN: 289x speedup over MLP using Spectral Graphs

https://zenodo.org/login/?next=%2Fme%2Fuploads%3Fq%3D%26f%3Dshared_with_me%25253Afalse%26l%3Dlist...
1•andrespi•24m ago•0 comments

Teaching Mathematics

https://www.karlin.mff.cuni.cz/~spurny/doc/articles/arnold.htm
2•samuel246•27m ago•0 comments

3D Printed Microfluidic Multiplexing [video]

https://www.youtube.com/watch?v=VZ2ZcOzLnGg
2•downboots•27m ago•0 comments

Abstractions Are in the Eye of the Beholder

https://software.rajivprab.com/2019/08/29/abstractions-are-in-the-eye-of-the-beholder/
2•whack•27m ago•0 comments

The Theoretical Limitations of Embedding-Based Retrieval

https://arxiv.org/abs/2508.21038
151•fzliu•5mo ago

Comments

gdiamos•5mo ago
Their idea is that the capacity of even 4096-dimensional vectors limits their performance.

Sparse models like BM25 have a huge dimension and thus don’t suffer from this limit, but they don’t capture semantics and can’t follow instructions.

It seems like the holy grail is a sparse semantic model. I wonder how SPLADE would do?

CuriouslyC•5mo ago
We already have "sparse" embeddings. Google's Matryoshka embedding scheme can scale embeddings from ~150 dimensions to >3k, and it's the same embedding with layers of representational meaning. Imagine decomposing an embedding along its principal components, then streaming the component vectors in order of their eigenvalues; that's roughly the idea.
jxmorris12•5mo ago
Matryoshka embeddings are not sparse. And SPLADE can scale to tens or hundreds of thousands of dimensions.
CuriouslyC•5mo ago
If you consider the actual latent space to be the full higher-dimensional representation, and you take the first principal component, the other components are zero. Pretty sparse. No, it's not a linked-list sparse matrix. Don't be a pedant.
zwaps•5mo ago
No one means Matryoshka embeddings when they talk about sparse embeddings. This is not pedantic.
CuriouslyC•5mo ago
No one means wolves when they talk about dogs, obviously wolves and dogs are TOTALLY different things.
cap11235•5mo ago
Why?
yorwba•5mo ago
When you truncate Matryoshka embeddings, you get the storage benefits of low-dimensional vectors with the limited expressiveness of low-dimensional vectors. Usually, what people look for in sparse vectors is to combine the storage benefits of low-dimensional vectors with the expressiveness of high-dimensional vectors. For that, you need the non-zero dimensions to be different for different vectors.
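To make the distinction concrete, here is a toy numpy sketch (the dimensions, term ids, and weights are all made up for illustration): a Matryoshka-style truncation keeps the same leading dimensions for every document, while a SPLADE-style sparse vector keeps a different small set of non-zero dimensions per document out of a vocabulary-sized space.

    import numpy as np

    rng = np.random.default_rng(0)
    full_dim, kept = 4096, 256        # hypothetical dense embedding sizes

    doc_a = rng.normal(size=full_dim)
    doc_b = rng.normal(size=full_dim)

    # Matryoshka-style truncation: every document keeps the SAME leading dims,
    # so storage shrinks, but so does expressiveness.
    a_trunc, b_trunc = doc_a[:kept], doc_b[:kept]

    # SPLADE-style sparsity: each document keeps a DIFFERENT small set of
    # non-zero dimensions (term ids) out of a huge vocabulary-sized space.
    a_sparse = {101: 1.2, 2054: 0.7, 17283: 0.3}    # {dim index: weight}
    b_sparse = {101: 0.9, 9402: 1.1, 28991: 0.5}

    def sparse_dot(x, y):
        # Only dimensions present in both documents contribute to the score.
        return sum(w * y[i] for i, w in x.items() if i in y)

    print(a_trunc @ b_trunc)               # dense score over the shared 256 dims
    print(sparse_dot(a_sparse, b_sparse))  # score over overlapping term ids only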
faxipay349•5mo ago
Yeah, the standard SPLADE model trained from BERT typically already has a vocabulary/vector size of 30,522. If the SPLADE model is based on a multilingual version of BERT, such as mBERT or XLM-R, the vocabulary size inherently expands to approximately 100,000, and the vector size with it.
3abiton•5mo ago
Doesn't PCA compress the embeddings in this case, ie reduce the accuracy? It's similar to quantization.
CuriouslyC•5mo ago
Component analysis doesn't fundamentally reduce information; it just rotates it into a more informative basis. People usually drop the low-eigenvalue components to do dimensionality reduction, but you don't have to do that.
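A quick numpy check of that claim, assuming "component analysis" means a full PCA rotation with no components dropped: the rotation is orthogonal, so all pairwise inner products of the centered embeddings are unchanged; only truncation loses information.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 64))              # 100 embeddings, 64 dims
    Xc = X - X.mean(axis=0)                     # center before PCA

    # Full PCA rotation: project onto ALL principal components, drop none.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    X_rot = Xc @ Vt.T                           # same data, rotated basis

    # Pairwise inner products are unchanged by the orthogonal rotation.
    print(np.allclose(Xc @ Xc.T, X_rot @ X_rot.T))   # True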
miven•5mo ago
Correct me if I'm misinterpreting something in your argument, but as I see it, Matryoshka embeddings just sort the basis vectors of the output space roughly by order of their importance for the task, PCA-style. So when you truncate your 4096-dimensional embedding down to, say, 256 dimensions, those are the exact same 256 basis vectors doing the core job of encoding the important information for each sample. You're back to dense retrieval on 256-dimensional vectors, just with the minor miscellaneous slack that's useful for only a small fraction of queries trimmed away.

True sparsity would imply keeping different important basis vectors for different documents, but MRL doesn't magically shuffle basis vectors around depending on what your document contains. Were that the case, cosine similarity between the resulting document embeddings would simply make no sense as a similarity measure.

tkfoss•5mo ago
Wouldn't the holy grail then be parallel channels for candidate generation:

  euclidean embedding
  hyperbolic embedding
  sparse BM25 / SPLADE lexical search
  optional multi-vector signatures

  ↓ merge & deduplicate candidates
followed by weight scoring, expansion (graph) & rerank (LLM)?
jdthedisciple•5mo ago
That is pretty much exactly what we do for our company-internal knowledge retrieval:

    embedding search (0.4)
    lexical/keyword search (0.4)
    fuzzy search (0.2)
might indeed achieve the best of all worlds
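A minimal sketch of that kind of weighted fusion, using the 0.4/0.4/0.2 weights from the comment above; the three retrieval functions are hypothetical placeholders for whatever a real stack (vector index, BM25, fuzzy matcher) returns.

    def normalize(scores):
        """Min-max normalize {doc_id: score} so the channels are comparable."""
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    def hybrid_search(query, channels, weights):
        """channels: {name: fn(query) -> {doc_id: score}}; weights: {name: float}."""
        fused = {}
        for name, retrieve in channels.items():
            for doc, score in normalize(retrieve(query)).items():
                fused[doc] = fused.get(doc, 0.0) + weights[name] * score
        return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

    # Hypothetical wiring:
    # results = hybrid_search(q,
    #     {"embedding": embed_search, "lexical": bm25_search, "fuzzy": fuzzy_search},
    #     {"embedding": 0.4, "lexical": 0.4, "fuzzy": 0.2})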
faxipay349•5mo ago
I just came across an evaluation of state-of-the-art SPLADE models. Yeah, they use BERT's vocabulary size as their sparse vector dimensionality and do capture semantics. As expected, they significantly outperform all dense models in this benchmark: https://github.com/frinkleko/LIMIT-Sparse-Embedding. The OpenSearch team seems to have been working on inference-free versions of these models: similar to BM25, they only encode documents offline. So now we have sparse, small, efficient models that are much better than dense ones, at least on LIMIT.
ArnavAgrawal03•5mo ago
We used multi-vector models at Morphik, and I can confirm their real-world effectiveness, especially when compared with single-vector dense retrieval.
codingjaguar•5mo ago
Curious, are those ColBERT-like models?
bjornsing•5mo ago
Can you share more about the multi-vector approach? Is it open source?
Straw•5mo ago
In the theoretical section, they extrapolate assuming a polynomial from 40 to thousands of dimensions. Why do they trust a polynomial fit to extrapolate two orders of magnitude? Why do we even think it's polynomial instead of exponential in the first place? Most things like this increase exponentially with dimension.

In fact, I think we can do it in d=2k dimensions, if we're willing to have arbitrarily precise query vectors.

Embed our points as (sin(theta), cos(theta), sin(2 x theta), cos(2 x theta)..., sin(k x theta), cos(k x theta)), with theta uniformly spaced around the circle, and we should be able to select any k of them.

Using a few more dimensions we can then ease the precision requirements on the query.

namibj•5mo ago
In practice you're actually hitting further problems, because you don't have those synthetic top-k tasks but rather open-domain documents and queries to support. And if you hope to get better than "just" having the top-k correct, and instead get some sort of inclusion/exclusion boundary between what should be matched and what should not, you'll hit the same bounds that apply to context-length limitations for query-key dimensionality in a transformer's attention layers, as I mentioned about 6 weeks ago: https://news.ycombinator.com/item?id=44570650
yorwba•5mo ago
I'm not following your construction. In the k=2 case, how do you construct your 4-dimensional query vector so that the dot product is maximized for two different angles theta and phi, but lower for any other arbitrary angle?
Straw•5mo ago
Viewing our space as two complex points instead of four real:

Let H(z) = (z - exp(i theta)) (z - exp(i phi))

H(z) is zero only for our two points. We'll now take the norm^2, to get something that's minimized on only our two chosen points.

|H(exp(i x))|^2 = H(exp(i x)) * conj(H(exp(i x))) = sum from -2 to 2 of c_j * exp(i j x)

For some c_j (just multiply it out). Note that this is just a Fourier series with highest harmonic 2 (or k in general), so in fact our c_j tell us the four coefficients to use.

For a simpler, but numerically nastier construction, instead use (t, t^2, t^3, ...), the real moment curve. Then given k points, take the degree k poly with those zeros. Square it, negate, and we get a degree 2k polynomial that's maximized only on our selected k points. The coefficients of this poly are the coefficients of our query vector, as once expanded onto the moment curve we can evaluate polynomials by dot product with the coefficients.

The complex version is exactly the same idea but with z^k and z ranging on the unit circle instead.
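A small numpy sketch of the moment-curve version (the positions, counts, and the set of "selected" documents are arbitrary): embed document i as (t_i, t_i^2, ..., t_i^{2k}), build the degree-k polynomial with zeros at the k chosen positions, square and negate it, and use its coefficients as the query; the dot product is then maximized exactly on the chosen documents.

    import numpy as np

    k, n = 3, 20                  # select k of n documents
    d = 2 * k                     # embedding dimension (constant term dropped)

    # Document i lives at scalar position t_i on the real moment curve:
    # its embedding is (t_i, t_i^2, ..., t_i^{2k}).
    t = np.linspace(-1.0, 1.0, n)
    docs = np.stack([t**j for j in range(1, d + 1)], axis=1)   # shape (n, 2k)

    selected = [2, 7, 13]         # any k documents we want ranked on top

    p = np.poly(t[selected])      # degree-k poly with zeros at the selected t's
    q = -np.polymul(p, p)         # degree-2k poly, maximized (= 0) only there

    coeffs = q[::-1]              # reorder to constant-first: c_0, c_1, ..., c_2k
    query = coeffs[1:]            # the constant shifts all scores equally; drop it

    scores = docs @ query         # score_i = q(t_i) - c_0
    top_k = np.argsort(scores)[-k:]
    print(set(top_k) == set(selected))   # True: exactly the chosen documents win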

lunarmony•5mo ago
Researchers have discussed the limitations of vector-based retrieval from a rank perspective in various forms for a few years. It's further been shown that better alternatives exist; some low-rank approaches can theoretically approximate arbitrary high-rank distributions while permitting MIPS-level efficient inference (see e.g. Retrieval with Learned Similarities, https://arxiv.org/abs/2407.15462). Such solutions are already being used in production at Meta and at LinkedIn.
yorwba•5mo ago
I don't think Mixture of Logits from the paper you link circumvents the theoretical limitations pointed out here, since their dataset size mostly stays well below the limit.

In the end they still rely on Maximum Inner Product Search, just with several lookups over smaller partitions of the full embedding. Their largest dataset is Books, where this paper suggests you'd need more than 512 embedding dimensions, and MoL with 256-dimensional embeddings split into 8 parts of 32 each has an abysmal hit rate.

So that's hardly a demonstration that arbitrary high-rank distributions can be approximated well. MoL seems to approximate it better than other approaches, but all of them are clearly hampered by the small embedding size.

lunarmony•5mo ago
Mixture of Logits was actually already deployed on 100M scale+ datasets at Meta and at LinkedIn (https://arxiv.org/abs/2306.04039 https://arxiv.org/abs/2407.13218 etc.). The crucial departure from traditional embedding/multi-embedding approaches is in learning a query-/item- dependent gating function, which enables MoL to become a universal high-rank approximator (assuming we care about recall@1) even when the input embeddings are low rank.
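For intuition, a toy numpy sketch of the mixture-of-logits scoring idea as described in those papers (the shapes and the gating function here are made up; in the deployed systems the gate is a learned function of query and item features): each query/item pair gets several low-dimensional component similarities, and a query-/item-dependent gate mixes them into one score.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    P, sub_d, n_docs = 8, 32, 1000      # 8 components of 32 dims each (256 total)

    q = rng.normal(size=(P, sub_d))             # component embeddings for one query
    docs = rng.normal(size=(n_docs, P, sub_d))  # component embeddings per document

    # Per-component inner products: logits[i, p] = <q_p, d_{i,p}>
    logits = np.einsum('pd,npd->np', q, docs)

    # Query-/item-dependent gate (here simply a softmax over the component
    # logits themselves; the real systems learn this gate instead).
    gates = softmax(logits, axis=1)

    # Mixture-of-logits score: gated sum of the component similarities.
    scores = (gates * logits).sum(axis=1)
    top10 = np.argsort(scores)[-10:][::-1]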
zwaps•5mo ago
How does it look with ColBERT-style late-interaction embeddings?
yorwba•5mo ago
The dashed brown line in figures 3, 4, and 6 is labeled GTE-ModernColBERT. So better than all single-vector models, but worse than BM25.
mingtianzhang•5mo ago
We are always looking for representations that can capture the meaning of information. However, most representations that compress information for retrieval are lossy. For example, embeddings are a form of lossy compression. Similar to the no-free-lunch theorem, no lossy compression method is universally better than another, since downstream tasks may depend on the specific information that gets lost. Therefore, the question is not which representation is perfect, but which representation is better aligned with an AI system. Because AI evolves rapidly, it is difficult to predict the limitations of the next generation of LLMs. For this reason, a good representation for information retrieval in future LLM systems should be closer to how humans represent knowledge.

When a human tries to retrieve information in a library, they first locate a book by category or by using a metadata keyword search. Then, they open the table of contents (ToC) to find the relevant section, and repeat this process as needed. Therefore, I believe the future of AI retrieval systems should mimic this process. The recently popular PageIndex approach (see this discussion: https://news.ycombinator.com/item?id=45036944) also belongs to this category, as it generates a table-of-contents–like tree for LLMs to reason over. Again, it is a form of lossy compression, so its limitations can be discussed. However, this approach is the closest to how humans perform retrieval.
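A toy sketch of that ToC-style retrieval loop (the Node structure and the choose_children selector are hypothetical; in a PageIndex-like system the selection step would be an LLM reasoning over section titles and summaries rather than keyword overlap):

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        title: str
        summary: str = ""
        text: str = ""                               # leaf content (a section body)
        children: list["Node"] = field(default_factory=list)

    def choose_children(query, nodes):
        # Placeholder selector: keep children whose title/summary share a query word.
        words = set(query.lower().split())
        return [n for n in nodes
                if words & set((n.title + " " + n.summary).lower().split())]

    def toc_retrieve(query, node):
        if not node.children:                        # reached a section: return its text
            return [node.text]
        hits = []
        for child in choose_children(query, node.children):
            hits.extend(toc_retrieve(query, child))  # descend the ToC level by level
        return hits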

mingtianzhang•5mo ago
A follow-up question is: what is the lossless way to represent knowledge? That would mean reading all the knowledge at once, which is the most accurate but also the least efficient method. Therefore, for different applications, we need to find an appropriate trade-off between accuracy and efficiency. In systems like real-time recommendation, we prefer efficiency over accuracy, so vector-based search is suitable. In domain-specific QA, we prefer accuracy over efficiency, so a table-of-contents–based search may be the better choice.
mingtianzhang•5mo ago
It is also worth mentioning that compression and generative AI are two sides of the same coin. I highly recommend the book "Information Theory, Inference, and Learning Algorithms" by David MacKay, which explores these deep connections.
jandrewrogers•5mo ago
This is the subject of the Hutter Prize and the algorithmic information theory that underpins it. There are some hard algorithm and data structure problems underlying lossless approximations of general learning even for relatively closed domains.

As an example, current AI is famously very poor at learning relationships between non-scalar types, like complex geometry, which humans learn with ease. That isn’t too surprising because the same representation problem exists in non-AI computer science.

simne•5mo ago
Looks like people use a hybrid approach, as all these ToCs, metadata, etc. are essentially semantic (even when executed on a neural network), while only the recognition of text and characters is genuinely neural.
mingtianzhang•5mo ago
Yeah, I guess another point is that with a ToC or metadata, the representation is transparent: people know what information is lost. Vector-based representations, on the other hand, are a black box. Explainability and transparency are also important considerations in production-level AI systems.
anothernewdude•5mo ago
> Similar to the no-free-lunch theorem, no lossy compression method is universally better than another,

The no-free-lunch theorem only works because it assumes you care about every single value of noise. Nobody does. There's a free lunch to be had, and it's order: you don't care about a single-pixel difference between two cat pictures, but NFL does.

Lossy compression is precisely where NFL does not apply.

mingtianzhang•5mo ago
I only meant it's similar in style to that theorem; I'm trying to emphasise that no lossy representation is universally (i.e., for all downstream tasks) better than another.
quadhome•5mo ago
Humans only retrieve information in a library that way because of past limitations on retrieval and processing. The invention of technologies like the table of contents, or even the Dewey Decimal Classification, was strongly constrained by fundamental technologies like ... the alphabet! And remember, not all languages are alphabetic. Embeddings aren't alphabetic and don't share the same constraints.

I recommend Judith Flanders' "A Place for Everything" as both a history and a survey of the constraints on sorting and organising information in an alphabetic language. It's also a fun read!

tl;dr why would we want an LLM to do something as inefficiently as a human?

mingtianzhang•5mo ago
"why would we want an LLM do something as inefficiently as a human?" -- That is a good point. Maybe we should rename artificial intelligence (AI) to super-artificial intelligence (SAI).
simne•5mo ago
Could somebody suggest a good introduction to simulating complex behavior with neural networks?

I mean, I have heard about experiments running Turing machine simulations on NNs, or even simulating some physics on NNs, but I have not seen any good survey of these topics, and they could be very interesting.