frontpage.

Rotten Tomatoes Desperately Claims 'Impossible' Rating for 'Melania' Is Real

https://www.thedailybeast.com/obsessed/rotten-tomatoes-desperately-claims-impossible-rating-for-m...
1•juujian•56s ago•0 comments

The protein denitrosylase SCoR2 regulates lipogenesis and fat storage [pdf]

https://www.science.org/doi/10.1126/scisignal.adv0660
1•thunderbong•2m ago•0 comments

Los Alamos Primer

https://blog.szczepan.org/blog/los-alamos-primer/
1•alkyon•4m ago•0 comments

NewASM Virtual Machine

https://github.com/bracesoftware/newasm
1•DEntisT_•7m ago•0 comments

Terminal-Bench 2.0 Leaderboard

https://www.tbench.ai/leaderboard/terminal-bench/2.0
1•tosh•7m ago•0 comments

I vibe coded a BBS bank with a real working ledger

https://mini-ledger.exe.xyz/
1•simonvc•7m ago•1 comments

The Path to Mojo 1.0

https://www.modular.com/blog/the-path-to-mojo-1-0
1•tosh•10m ago•0 comments

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

https://github.com/voice-of-japan/Virtual-Protest-Protocol/blob/main/README.md
4•sakanakana00•13m ago•0 comments

Show HN: I built Divvy to split restaurant bills from a photo

https://divvyai.app/
3•pieterdy•16m ago•0 comments

Hot Reloading in Rust? Subsecond and Dioxus to the Rescue

https://codethoughts.io/posts/2026-02-07-rust-hot-reloading/
3•Tehnix•16m ago•1 comments

Skim – vibe review your PRs

https://github.com/Haizzz/skim
2•haizzz•18m ago•1 comments

Show HN: Open-source AI assistant for interview reasoning

https://github.com/evinjohnn/natively-cluely-ai-assistant
4•Nive11•18m ago•6 comments

Tech Edge: A Living Playbook for America's Technology Long Game

https://csis-website-prod.s3.amazonaws.com/s3fs-public/2026-01/260120_EST_Tech_Edge_0.pdf?Version...
2•hunglee2•22m ago•0 comments

Golden Cross vs. Death Cross: Crypto Trading Guide

https://chartscout.io/golden-cross-vs-death-cross-crypto-trading-guide
2•chartscout•24m ago•0 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
3•AlexeyBrin•27m ago•0 comments

What the longevity experts don't tell you

https://machielreyneke.com/blog/longevity-lessons/
2•machielrey•28m ago•1 comments

Monzo wrongly denied refunds to fraud and scam victims

https://www.theguardian.com/money/2026/feb/07/monzo-natwest-hsbc-refunds-fraud-scam-fos-ombudsman
3•tablets•33m ago•1 comments

They were drawn to Korea with dreams of K-pop stardom – but then let down

https://www.bbc.com/news/articles/cvgnq9rwyqno
2•breve•35m ago•0 comments

Show HN: AI-Powered Merchant Intelligence

https://nodee.co
1•jjkirsch•38m ago•0 comments

Bash parallel tasks and error handling

https://github.com/themattrix/bash-concurrent
2•pastage•38m ago•0 comments

Let's compile Quake like it's 1997

https://fabiensanglard.net/compile_like_1997/index.html
2•billiob•39m ago•0 comments

Reverse Engineering Medium.com's Editor: How Copy, Paste, and Images Work

https://app.writtte.com/read/gP0H6W5
2•birdculture•44m ago•0 comments

Go 1.22, SQLite, and Next.js: The "Boring" Back End

https://mohammedeabdelaziz.github.io/articles/go-next-pt-2
1•mohammede•50m ago•0 comments

Laibach the Whistleblowers [video]

https://www.youtube.com/watch?v=c6Mx2mxpaCY
1•KnuthIsGod•51m ago•1 comments

Slop News - The Front Page right now but it's only Slop

https://slop-news.pages.dev/slop-news
1•keepamovin•56m ago•1 comments

Economists vs. Technologists on AI

https://ideasindevelopment.substack.com/p/economists-vs-technologists-on-ai
1•econlmics•58m ago•0 comments

Life at the Edge

https://asadk.com/p/edge
4•tosh•1h ago•0 comments

RISC-V Vector Primer

https://github.com/simplex-micro/riscv-vector-primer/blob/main/index.md
4•oxxoxoxooo•1h ago•1 comments

Show HN: Invoxo – Invoicing with automatic EU VAT for cross-border services

2•InvoxoEU•1h ago•0 comments

A Tale of Two Standards, POSIX and Win32 (2005)

https://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html
4•goranmoomin•1h ago•0 comments

How big are our embeddings now and why?

https://vickiboykis.com/2025/09/01/how-big-are-our-embeddings-now-and-why/
113•alexmolas•5mo ago

Comments

numlocked•5mo ago
I don’t quite understand. The article says things like:

“With the constant upward pressure on embedding sizes not limited by having to train models in-house, it’s not clear where we’ll slow down: Qwen-3, along with many others is already at 4096”

But aren’t embedding models separate from the LLMs? The size of attention heads in LLMs, etc., isn’t inherently connected to how a lab might train and release an embedding model. I don’t really understand why growth in LLM size fundamentally puts upward pressure on embedding size, as the two are not intrinsically connected.

svachalek•5mo ago
All LLMs use embeddings; it's just that embedding models stop there, while for a full chat/completion model that's only the first step of the process. Embeddings are coordinates in the latent space of the transformer.
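
A toy numpy sketch of that idea (made-up sizes, not any particular model's API): the token-embedding matrix maps token ids to d-dimensional vectors; an embedding model runs its layers, pools those vectors, and stops, while a chat/completion model feeds them onward to predict the next token.

    import numpy as np

    vocab_size, d_model = 50_000, 1024          # made-up sizes
    rng = np.random.default_rng(0)
    E = rng.normal(size=(vocab_size, d_model))  # learned token-embedding matrix

    token_ids = np.array([17, 4021, 933])       # a tokenized input
    token_vecs = E[token_ids]                   # (3, d_model): coordinates in latent space

    # An embedding model (after its transformer layers) pools and stops:
    sentence_vec = token_vecs.mean(axis=0)      # e.g. mean pooling -> one vector per text
    sentence_vec /= np.linalg.norm(sentence_vec)

    # A chat/completion model instead keeps going: these vectors are only the
    # input to the attention/MLP stack that predicts the next token.
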
indeed30•5mo ago
I wouldn’t call the embedding layer "separate" from the LLM. It’s learned jointly with the rest of the network, and its dimensionality is one of the most fundamental architectural choices. You’re right, though, that in principle you can pick an embedding size independently of other hyperparameters like the number of layers or heads, so I see where you're coming from.

However, the embedding dimension sets the rank of the token representation space. Each layer can transform or refine those vectors, but it can’t expand their intrinsic capacity; a tall but narrow network is bottlenecked by that width (toy sketch below). Width-first scaling tends to outperform pure depth scaling: you want enough representational richness per token before you start stacking more layers of processing.

So yeah, embedding size doesn’t have to scale up in lockstep with model size, but in practice it usually does, because once models grow deeper and more capable, narrow embeddings quickly become the limiting factor.
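
A toy illustration of the width bottleneck (made-up shapes, nothing model-specific): however many layers you stack on a d-dimensional residual stream, each token is still described by only d numbers, so the representation matrix can never exceed rank d.

    import numpy as np

    rng = np.random.default_rng(0)
    n_tokens, d_narrow = 512, 64                # made-up sizes

    X = rng.normal(size=(n_tokens, d_narrow))   # token states out of a narrow embedding layer

    H = X
    for _ in range(24):                         # stack many same-width "layers"
        W = rng.normal(size=(d_narrow, d_narrow))
        H = np.tanh(H @ W)

    print(H.shape)                              # (512, 64): still only 64 numbers per token
    print(np.linalg.matrix_rank(X))             # 64
    print(np.linalg.matrix_rank(H))             # still at most 64, regardless of depth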

numlocked•5mo ago
I hear you, but the article is talking specifically about "embeddings as a product" -- not the embeddings that are within an LLM architecture. It starts:

> As a quick review, embeddings are compressed numerical representations of a variety of features (text, images, audio) that we can use for machine learning tasks like search, recommendations, RAG, and classification.

Current standalone embedding models are not intrinsically connected to SotA LLM architectures (e.g. the Qwen reference) -- right? The article seems to mix the two ideas together.

gojomo•5mo ago
The LLMs need the embedding function, benefit from growth, do the training – and then other uses get that embedding "for free".

So an old downward pressure on sizes – internal training costs & resource limits – is now weaker. And as long as LLMs keep seeing benefits from larger embeddings, those larger embeddings will become more common and available. (Of course, via truncation etc., no one is forced to use a larger size than works for them... but larger may keep becoming more common & available.)

minimaxir•5mo ago
It's the same Jevons paradox reason as why LLMs are so big despite massive diminishing returns. If we can output 4096Ds, why not use all the Ds?

As with LLMs, the bottleneck is still training data and the training regimen, but there's also demand for smaller embedding models due to both storage and compute concerns. EmbeddingGemma (https://huggingface.co/google/embeddinggemma-300m), released just yesterday, beats the 4096D Qwen-3 benchmarks at 768D, and using the 128D equivalent via MRL beats many 768D embedding models.
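
For reference, the MRL trick at inference time is just prefix-truncation plus re-normalization. A hedged sketch, assuming the model was trained Matryoshka-style so the leading dimensions carry most of the signal; full_vec here is a stand-in, not EmbeddingGemma's actual API:

    import numpy as np

    def truncate_mrl(full_vec: np.ndarray, dim: int) -> np.ndarray:
        """Keep the first `dim` dimensions of an MRL-trained embedding and re-normalize."""
        v = full_vec[:dim]
        return v / np.linalg.norm(v)

    # full_vec stands in for a 768D embedding from an MRL-trained model
    full_vec = np.random.default_rng(0).normal(size=768)
    small_vec = truncate_mrl(full_vec, 128)     # 6x less storage per vector

    print(small_vec.shape)                      # (128,)
    print(np.linalg.norm(small_vec))            # ~1.0, so cosine similarity still works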

fredophile•5mo ago
I'm not an expert on LLMs, but my guess would be that this is a result of the curse of dimensionality. As a general rule, more dimensions != more better.

Xenoamorphous•5mo ago
Question for the experts: a few years back (even before covid times?) I was tasked with building a news aggregator. Universal Sentence Encoder was new, and we didn’t even have BERT back then. It felt magical (at least as a regular software dev) seeing how the cosine similarity score was heavily correlated with how similar (from a semantic standpoint) two given pieces of text were. That plus some clustering algorithm got the job done.

A few months ago I happened to play with one of OpenAI’s embedding models (can’t remember which one) and I was shocked to see that the cosine similarities of most texts were super close, even if the texts had nothing in common. It’s like the wide 0-1 range that USE (and later BERT) were giving me was compressed to a range of perhaps 0.2. Why is that? Does it mean those embeddings are not great for semantic similarity?

minimaxir•5mo ago
It's likely because the definition of "similar" varies, and it doesn't necessarily mean semantic similarity. Depending on how the embedding model was trained, texts with merely a similar format or syntax can indeed be "similar" on that axis.

The absolute value of the cosine similarity isn't critical (only the order matters when comparing multiple candidates), but if you finetune an embedding model for a specific domain, the model will give a wider range of cosine similarities, since it can learn which attributes specifically make texts similar or dissimilar.
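
Concretely (toy numbers, hypothetical embeddings): retrieval only looks at the order of the similarities, so a model whose scores all sit in a narrow band, say 0.70-0.85, returns the same top-k as one whose scores spread across 0.0-1.0, as long as the ordering agrees.

    import numpy as np

    def rank_by_cosine(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
        """Return document indices sorted by cosine similarity to the query, best first."""
        q = query / np.linalg.norm(query)
        d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
        return np.argsort(-(d @ q))

    rng = np.random.default_rng(0)
    query = rng.normal(size=256)                # made-up 256D embeddings
    docs = rng.normal(size=(5, 256))

    print(rank_by_cosine(query, docs))
    # Even if every raw similarity sits in a "compressed" band, the ranking (and
    # therefore what you retrieve) is unchanged as long as the relative order of
    # the scores is preserved.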

teepo•5mo ago
Thanks - that helped it click a bit more. If the relative ordering is correct, it doesn't matter that the scores look so compressed.

clickety_clack•5mo ago
That’s not necessarily true. If the embedding model hasn’t been trained on data you care about, then similarity might be dominated by features you don’t care about. Maybe you want documents that reference pizza toppings, but the embedding similarity might actually be dominated by tone, word complexity, and the use of em dashes relative to your prompt. That means the relative ordering might not turn out the way you want.

teepo•5mo ago
I was reading somewhere that the BERT- and USE-style models were "big semantic space" models, designed so that scores span roughly 0.0-1.0 and unrelated things land close to 0.0, more like classifiers.

But now, embeddings like the OpenAI one you're talking about are constrained and trained with retrieval in mind: pairs are pulled closer together, which makes them easier to search.

lmeyerov•5mo ago
We have been at a fork for louie.ai:

* Small data - talk to your PDF on-the-fly, etc.: getting bigger & faster via cloud APIs

* Big data - for RAG: getting smaller, because we don't want to pay crazy fees for vector DB hosting, and it's doable because it's now easier to get higher-quality small embeddings that hold up
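
The rough arithmetic behind the hosting-cost point (illustrative numbers, not louie.ai's actual footprint): raw float32 storage is 4 bytes per dimension, so the embedding width translates directly into vector-DB size before any index overhead.

    # Raw vector storage in float32 (4 bytes per dimension), before index overhead:
    n_vectors = 100_000_000                     # illustrative corpus size
    for dims in (4096, 768, 128):
        gb = n_vectors * dims * 4 / 1e9
        print(f"{dims}D: {gb:,.0f} GB")
    # => 4096D: 1,638 GB / 768D: 307 GB / 128D: 51 GB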

bitwize•5mo ago
And why do we have to keep embiggening our embeddings?