LLM Embeddings Explained: A Visual and Intuitive Guide

https://huggingface.co/spaces/hesamation/primer-llm-embedding
159•eric-burel•6h ago•24 comments

Debian switches to 64-bit time for everything

https://www.theregister.com/2025/07/25/y2k38_bug_debian/
124•pseudolus•2h ago•58 comments

Show HN: I made a tool to generate photomosaics with your pictures

https://pictiler.com
27•jakemanger•1h ago•9 comments

Enough AI copilots, we need AI HUDs

https://www.geoffreylitt.com/2025/07/27/enough-ai-copilots-we-need-ai-huds
552•walterbell•14h ago•184 comments

SIMD within a register: How I doubled hash table lookup performance

https://maltsev.space/blog/012-simd-within-a-register-how-i-doubled-hash-table-lookup-performance
107•axeluser•7h ago•11 comments

Trying to play an isomorphic piano (2022) [video]

https://www.youtube.com/watch?v=j4itL174C-4
21•surprisetalk•3d ago•11 comments

Performance and telemetry analysis of Trae IDE, ByteDance's VSCode fork

https://github.com/segmentationf4u1t/trae_telemetry_research
880•segfault22•19h ago•312 comments

VPN use surges in UK as new online safety rules kick in

https://www.ft.com/content/356674b0-9f1d-4f95-b1d5-f27570379a9b
175•mmarian•10h ago•169 comments

Dumb Pipe

https://www.dumbpipe.dev/
816•udev4096•22h ago•197 comments

What would an efficient and trustworthy meeting culture look like?

https://abitmighty.com/posts/the-ultimate-meeting-culture
60•todsacerdoti•5h ago•47 comments

Blender: Beyond Mouse and Keyboard

https://code.blender.org/2025/07/beyond-mouse-keyboard/
196•dagmx•3d ago•54 comments

How I fixed my blog's performance issues by writing a new Jekyll plugin

https://arclight.run/how-i-fixed-my-blogs-performance-issues-by-writing-a-new-jekyll-plugin-jekyll-skyhook/
55•arclight_•3d ago•20 comments

I hacked my washing machine

https://nexy.blog/2025/07/27/how-i-hacked-my-washing-machine/
277•JadedBlueEyes•16h ago•124 comments

Software Development at 800 Words per Minute

https://neurrone.com/posts/software-development-at-800-wpm/
109•ClawsOnPaws•3d ago•38 comments

Terminal app can now run full graphical Linux apps in the latest Android Canary

https://www.androidauthority.com/linux-terminal-graphical-apps-3580905/
15•thunderbong•3d ago•8 comments

Samsung Removes Bootloader Unlocking with One UI 8

https://sammyguru.com/breaking-samsung-removes-bootloader-unlocking-with-one-ui-8/
77•1una•11h ago•55 comments

Making Postgres slower

https://byteofdev.com/posts/making-postgres-slow/
298•AsyncBanana•15h ago•32 comments

200k Flemish drivers can turn traffic lights green

https://www.vrt.be/vrtnws/en/2025/07/24/200-000-flemish-drivers-can-turn-traffic-lights-green-but-waze/
25•svenfaw•3d ago•53 comments

Claude Code Router

https://github.com/musistudio/claude-code-router
98•y1n0•12h ago•22 comments

Ask HN: What are you working on? (July 2025)

201•david927•19h ago•614 comments

Why I write recursive descent parsers, despite their issues (2020)

https://utcc.utoronto.ca/~cks/space/blog/programming/WhyRDParsersForMe
97•blobcode•4d ago•45 comments

ZUSE – The Modern IRC Chat for the Terminal Made in Go/Bubbletea

https://github.com/babycommando/zuse
82•babycommando•14h ago•38 comments

Multiplex: Command-Line Process Multiplexer

https://github.com/sebastien/multiplex
26•todsacerdoti•7h ago•6 comments

Solid protocol restores digital agency

https://www.schneier.com/blog/archives/2025/07/how-solid-protocol-restores-digital-agency.html
53•speckx•3d ago•29 comments

Big agriculture misled the public about the benefits of biofuels

https://lithub.com/how-big-agriculture-mislead-the-public-about-the-benefits-of-biofuels/
183•littlexsparkee•11h ago•165 comments

The JJ VCS workshop: A zero-to-hero speedrun

https://github.com/jkoppel/jj-workshop
145•todsacerdoti•1d ago•12 comments

EU age verification app to ban any Android system not licensed by Google

https://www.reddit.com/r/degoogle/s/YxmPgFes8a
859•cft•14h ago•497 comments

Mobile BESS Powers Remote Heavy Equipment

https://spectrum.ieee.org/mobile-bess
9•defrost•3d ago•4 comments

Formal specs as sets of behaviors

https://surfingcomplexity.blog/2025/07/26/formal-specs-as-sets-of-behaviors/
39•gm678•1d ago•6 comments

How big can I print my image?

https://maurycyz.com/misc/printing/
34•LorenDB•3d ago•5 comments

LLM Embeddings Explained: A Visual and Intuitive Guide

https://huggingface.co/spaces/hesamation/primer-llm-embedding
159•eric-burel•6h ago

Comments

carschno•4h ago
Nice explanations! One (more advanced) aspect I find missing is the difference between encoder-decoder transformer models (BERT) and "decoder-only" generative models, with respect to the embeddings.
dust42•2h ago
Minor correction: BERT is an encoder (not an encoder-decoder); ChatGPT is a decoder.

Encoders like BERT produce better results for embeddings because they look at the whole sentence, while GPTs look from left to right:

Imagine you're trying to understand the meaning of a word in a sentence, and you can read the entire sentence before deciding what that word means. For example, in "The bank was steep and muddy," you can see "steep and muddy" at the end, which tells you "bank" means the side of a river (aka riverbank), not a financial institution. BERT works this way - it looks at all the words around a target word (both before and after) to understand its meaning.

Now imagine you have to understand each word as you read from left to right, but you're not allowed to peek ahead. So when you encounter "The bank was..." you have to decide what "bank" means based only on "The" - you can't see the helpful clues that come later. GPT models work this way because they're designed to generate text one word at a time, predicting what comes next based only on what they've seen so far.

Here is another link from Hugging Face, about ModernBERT, which has more info: https://huggingface.co/blog/modernbert

Also worth a look: neoBERT https://huggingface.co/papers/2502.19587
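
A rough sketch of the "riverbank" example with a BERT-style encoder, assuming the Hugging Face transformers library and that "bank" survives as a single subword token (just an illustration, not code from the article):

  import torch
  from transformers import AutoModel, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  model = AutoModel.from_pretrained("bert-base-uncased")

  def embedding_of(sentence, word):
      # Return the hidden state of the first occurrence of `word` in `sentence`.
      inputs = tokenizer(sentence, return_tensors="pt")
      with torch.no_grad():
          hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
      tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
      return hidden[tokens.index(word)]

  river = embedding_of("The bank was steep and muddy.", "bank")
  river2 = embedding_of("We fished from the grassy bank of the river.", "bank")
  money = embedding_of("The bank raised its interest rates.", "bank")

  cos = torch.nn.functional.cosine_similarity
  print(cos(river, river2, dim=0))  # typically higher: both are riverbanks
  print(cos(river, money, dim=0))   # typically lower: different senses of "bank"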

jasonjayr•1h ago
As an extreme example that can (intentionally) confuse even human readers, see https://en.wikipedia.org/wiki/Garden-path_sentence
ubutler•1h ago
Further to @dust42, BERT is an encoder, GPT is a decoder, and T5 is an encoder-decoder.

Encoder-decoders are not in vogue.

Encoders are favored for classification, extraction (eg, NER and extractive QA) and information retrieval.

Decoders are favored for text generation, summarization and translation.

Recent research (see, eg, the Ettin paper: https://arxiv.org/html/2507.11412v1 ) seems to confirm the previous understanding that encoders are indeed better for "encoder tasks" and vice versa.

Fundamentally, both are transformers and so an encoder could be turned into a decoder or a decoder could be turned into an encoder.

The design difference comes down to bidirectional attention (ie, all tokens can attend to all other tokens) versus autoregressive attention (ie, the current token can only attend to the previous tokens).
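
A minimal sketch of that masking difference in NumPy (names and shapes are illustrative, not tied to any particular library):

  # Bidirectional (encoder): every token may attend to every other token.
  # Autoregressive (decoder): token i may attend only to tokens 0..i.
  import numpy as np

  seq_len = 5
  bidirectional_mask = np.ones((seq_len, seq_len), dtype=bool)
  causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))

  print(causal_mask.astype(int))
  # [[1 0 0 0 0]
  #  [1 1 0 0 0]
  #  [1 1 1 0 0]
  #  [1 1 1 1 0]
  #  [1 1 1 1 1]]
  # Row i = query token i; the columns are the key tokens it is allowed to see.

  # The mask is applied by blanking out disallowed attention scores before the softmax:
  scores = np.random.randn(seq_len, seq_len)
  masked_scores = np.where(causal_mask, scores, -np.inf)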

petesergeant•4h ago
I wrote a simpler explanation still, that follows a similar flow, but approaches it from more of a "problems to solve" perspective: https://sgnt.ai/p/embeddings-explainer/
k__•1h ago
Awesome, thanks!

If I understand this correctly, there are three major problems with LLMs right now.

1. LLMs reduce a very high-dimensional vector space into a very low-dimensional vector space. Since we don't know what the dimensions in the low-dimensional vector space mean, we can only check that the outputs are correct most of the time.

What research is happening to resolve this?

2. LLMs use written texts to facilitate this reduction. So, they don't learn from reality, but from what humans have written down about reality.

It seems like Keen Technologies tries to avoid this issue by using (simple) robots with sensors for training instead of human text, which seems like a much slower process but could yield more accurate models in the long run.

3. LLMs hold internal state as a vector that reflects the meaning and context of the "conversation". This explains why the quality of responses deteriorates in longer conversations: if one vector is "stamped over" again and again, the meaning of the first "stamps" gets blurred.

Are there alternative ways of holding state, or is the only way around this to back up that state vector at every point and revert if things go awry?

agentcoops•13m ago
Apologies if this comes across as too abstract, but I think your comment raises really important questions.

(1) While studying the properties of the mathematical objects produced is important, I don't think we should understand the situation you describe as a problem to be solved. In old supervised machine learning methods, human beings were tasked with defining the rather crude 'features' of relevance in a data/object domain, so each dimension had some intuitive significance (often binary 'is tall', 'is blue' etc). The question now is really about learning the objective geometry of meaning, so the dimensions of the resultant vector don't exactly have to be 'meaningful' in the same way -- and, counter-intuitive as it may seem, this is progress. Now the question is of the necessary dimensionality of the mathematical space in which semantic relations can be preserved -- and meaning /is/ in some fundamental sense the resultant geometry.

(2) This is where the 'Platonic hypothesis' research [1] is so fascinating: empirically we have found that the learned structures from text and image converge. This isn't saying we don't need images and sensor robots, but it appears we get the best results when training across modalities (language and image, for example). This is really fascinating for how we understand language. While any particular text might get things wrong, the language that human beings have developed over however many thousands of years really does seem to do a good job of breaking out the relevant possible 'features' of experience. The convergence of models trained from language and image suggests a certain convergence between what is learnable from sensory experience of the world and the relations that human beings have slowly come to know through the relations between words.

[1] https://phillipi.github.io/prh/ and https://arxiv.org/pdf/2405.07987

dotancohen•3h ago
One of the first sentences of the page clearly states:

  > This blog post is recommended for desktop users.
That said, there is a lot of content here that could have been made mobile-friendly with very little effort. The first image, of embeddings, is a prime example. It has been a very long time since I've seen any online content, let alone a blog post, that requires a desktop browser.
fastball•27m ago
> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.

https://news.ycombinator.com/newsguidelines.html

stirfish•13m ago
This is interesting, and I'm curious how it came to be that way.

>If your ears are more important than your eyes, you can listen to the podcast version of this article generated by NotebookLM.

It looks like an LLM would read it to you; I wonder if one could have made it mobile-friendly.

lynx97•3h ago
Shameless plug: If you want to experiment with semantic search for the pages you visit: https://github.com/mlang/llm-embed-proxy -- an intercepting proxy as an `llm` plugin.
smcleod•3h ago
Seems to be down?

Lots of console errors with the likes of "Content-Security-Policy: The page’s settings blocked an inline style (style-src-elem) from being applied because it violates the following directive: “style-src 'self'”." etc...

nycdatasci•3h ago
If you want to see many more than 50 words and also have an appreciation for 3D data visualization, check out the Embedding Projector (no affiliation): https://projector.tensorflow.org/
bob_theslob646•2h ago
If someone enjoyed learning about this, where should I suggest they start to learn more about embeddings?
ayhanfuat•2h ago
Vicki Boykis wrote a small book about it: https://vickiboykis.com/what_are_embeddings/
boulevard•1h ago
This is a great visual guide! I’ve also been working on a similar concept focused on deep understanding - a visual + audio + quiz-driven lesson on LLM embeddings, hosted on app.vidyaarthi.ai.

https://app.vidyaarthi.ai/ai-tutor?session_id=C2Wr46JFIqslX7...

Our goal is to make abstract concepts more intuitive and interactive — kind of like a "learning-by-doing" approach. Would love feedback from folks here.

(Not trying to self-promote — just sharing a related learning tool we’ve put a lot of thought into.)

amelius•1h ago
If LLMs are so smart, then why can't they run directly on 8-bit ASCII input rather than tokens based on embeddings?
pornel•1h ago
This isn't about smarts, but about performance and memory usage.

Tokens are a form of compression, and working on uncompressed representation would require more memory and more processing power.
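
A rough illustration of that compression (assuming the `tiktoken` library and its GPT-2 encoding; exact counts vary by tokenizer):

  import tiktoken

  enc = tiktoken.get_encoding("gpt2")
  text = "Working directly on raw ASCII would mean one position per character."

  tokens = enc.encode(text)
  print(len(text))    # number of characters (roughly bytes, since this is ASCII)
  print(len(tokens))  # number of positions the model actually attends over: far fewer

Since self-attention cost grows roughly quadratically with sequence length, attending over fewer positions matters a lot.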

amelius•46m ago
The opposite is true. ASCII and English are pretty good at compressing. I can say "cat" with just 24 bits. Your average LLM token embedding uses on the order of kilobits internally.
blutfink•26m ago
The LLM can also “say” “cat” with few bits. Note that the meaning of the word as stored in your brain takes more than 24 bits.
amelius•8m ago
No, an LLM really uses __much__ more bits per token.

First, the embedding typically uses thousands of dimensions.

Then, the value along each dimension is represented with a floating-point number, which typically takes 16 bits (it can be smaller with heavier quantization).
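
Back-of-the-envelope version of that claim (the dimension and precision below are illustrative assumptions, not numbers from any specific model):

  embedding_dim = 4096   # illustrative; real models range from hundreds to thousands of dimensions
  bits_per_value = 16    # fp16/bf16; lower with quantization

  bits_per_token_embedding = embedding_dim * bits_per_value   # 65,536 bits
  bits_for_ascii_cat = 3 * 8                                  # 24 bits

  print(bits_per_token_embedding // bits_for_ascii_cat)       # ~2,730x larger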

montebicyclelo•1h ago
Nice tutorial — the contextual vs. static embeddings distinction is the important point; many are familiar with word2vec (static), but contextual embeddings are more powerful for many tasks.

(However, there seems to be some serious back-button / browser-history hijacking on this page... Just scrolling down the page appends a ton to my browser history, which is lame.)
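
To make the static side of that contrast concrete, a toy sketch (the vectors are made up for illustration; compare with the contextual BERT sketch further up the thread):

  # A static embedding (word2vec-style) is just a lookup table: one vector per word type,
  # so "bank" gets the same vector in every sentence and its two senses are conflated.
  static_table = {
      "bank":  [0.12, -0.40, 0.88],
      "river": [0.10, -0.35, 0.91],
      "money": [0.75,  0.22, -0.10],
  }

  v1 = static_table["bank"]  # "The bank was steep and muddy."
  v2 = static_table["bank"]  # "The bank raised its interest rates."
  assert v1 == v2            # identical by construction; a contextual model would give different vectors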

khalic•54m ago
What a didactic and well-built article! My thanks to the author.
eric-burel•22m ago
Author's profile on Hugging Face: https://huggingface.co/hesamation

HN mods suggested I repost after a less successful share. I especially liked this article because the author goes through different types of embeddings rather than sticking to the definition.