
LLM Embeddings Explained: A Visual and Intuitive Guide

https://huggingface.co/spaces/hesamation/primer-llm-embedding
131•eric-burel•5h ago

Comments

carschno•3h ago
Nice explanations! A (more advanced) aspect I find missing is the difference between encoder-decoder transformer models (BERT) and "decoder-only" generative models, with respect to the embeddings.
dust42•1h ago
Minor correction: BERT is an encoder (not an encoder-decoder); ChatGPT is a decoder.

Encoders like BERT produce better results for embeddings because they look at the whole sentence, while GPTs look from left to right:

Imagine you're trying to understand the meaning of a word in a sentence, and you can read the entire sentence before deciding what that word means. For example, in "The bank was steep and muddy," you can see "steep and muddy" at the end, which tells you "bank" means the side of a river (aka riverbank), not a financial institution. BERT works this way - it looks at all the words around a target word (both before and after) to understand its meaning.

Now imagine you have to understand each word as you read from left to right, but you're not allowed to peek ahead. So when you encounter "The bank was..." you have to decide what "bank" means based only on "The" - you can't see the helpful clues that come later. GPT models work this way because they're designed to generate text one word at a time, predicting what comes next based only on what they've seen so far.
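
A quick sketch of that difference in practice (assuming the `transformers` library and `bert-base-uncased`; illustrative only, not from the linked guide):

  from transformers import AutoTokenizer, AutoModel
  import torch

  tok = AutoTokenizer.from_pretrained("bert-base-uncased")
  model = AutoModel.from_pretrained("bert-base-uncased")

  def bank_embedding(sentence):
      # Contextual embedding of the token "bank" from BERT's last hidden layer
      inputs = tok(sentence, return_tensors="pt")
      with torch.no_grad():
          hidden = model(**inputs).last_hidden_state[0]
      idx = inputs["input_ids"][0].tolist().index(tok.convert_tokens_to_ids("bank"))
      return hidden[idx]

  river = bank_embedding("The bank was steep and muddy.")
  money = bank_embedding("The bank raised interest rates.")
  # The same word gets a different vector depending on its context
  print(torch.cosine_similarity(river, money, dim=0))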

Here is another link, also from Hugging Face, about ModernBERT, which has more info: https://huggingface.co/blog/modernbert

Also worth a look: neoBERT https://huggingface.co/papers/2502.19587

jasonjayr•58m ago
As an extreme example that can (intentionally) confuse even human readers, see https://en.wikipedia.org/wiki/Garden-path_sentence
ubutler•1h ago
Further to @dust42, BERT is an encoder, GPT is a decoder, and T5 is an encoder-decoder.

Encoder-decoders are not in vogue.

Encoders are favored for classification, extraction (eg, NER and extractive QA) and information retrieval.

Decoders are favored for text generation, summarization and translation.

Recent research (see, eg, the Ettin paper: https://arxiv.org/html/2507.11412v1 ) seems to confirm the previous understanding that encoders are indeed better for "encoder tasks" and vice versa.

Fundamentally, both are transformers and so an encoder could be turned into a decoder or a decoder could be turned into an encoder.

The design difference comes down to bidirectional (ie, all tokens can attend to all other tokens) versus autoregressive attention (ie, the current token can only attend to the previous tokens).
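
A minimal illustration of those two attention masks (a PyTorch sketch, not tied to any particular model):

  import torch

  T = 5  # sequence length
  # Bidirectional (encoder): every token can attend to every other token
  bidirectional = torch.ones(T, T).bool()
  # Autoregressive (decoder): token i can attend only to tokens 0..i
  causal = torch.tril(torch.ones(T, T)).bool()
  print(causal.int())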

petesergeant•3h ago
I wrote a still simpler explanation that follows a similar flow but approaches it from more of a "problems to solve" perspective: https://sgnt.ai/p/embeddings-explainer/
k__•59m ago
Awesome, thanks!

If I understand this correctly, there are three major problems with LLMs right now.

1. LLMs reduce a very high-dimensional vector space into a very low-dimensional vector space. Since we don't know what the dimensions in the low-dimensional vector space mean, we can only check that the outputs are correct most of the time.

What research is happening to resolve this?

2. LLMs use written texts to facilitate this reduction. So they don't learn from reality, but from what humans have written down about reality.

It seems like Keen Technologies tries to avoid this issue by using (simple) robots with sensors for training instead of human text, which seems a much slower process but could yield more accurate models in the long run.

3. LLMs hold internal state as a vector that reflects the meaning and context of the "conversation". This explains why the quality of responses deteriorates with longer conversations: if one vector is "stamped over" again and again, the meaning of the first "stamps" will get blurred.

Are there alternative ways of holding state, or is the only way around this to back up that state vector at every point and revert if things go awry?

dotancohen•2h ago
One of the first sentences of the page clearly states:

  > This blog post is recommended for desktop users.
That said, there is a lot of content here that could have been mobile-friendly with very little effort. The first image, of embeddings, is a prime example. It has been a very long time since I've seen any online content, let alone a blog post, that requires a desktop browser.
lynx97•2h ago
Shameless plug: If you want to experiment with semantic search for the pages you visit: https://github.com/mlang/llm-embed-proxy -- an intercepting proxy as an `llm` plugin.
smcleod•2h ago
Seems to be down?

Lots of console errors with the likes of "Content-Security-Policy: The page’s settings blocked an inline style (style-src-elem) from being applied because it violates the following directive: “style-src 'self'”." etc...

nycdatasci•2h ago
If you want to see many more than 50 words, and also have an appreciation for 3D data visualization, check out the Embedding Projector (no affiliation): https://projector.tensorflow.org/
bob_theslob646•2h ago
If someone enjoyed learning about this, where should I suggest they start to learn more about embeddings?
ayhanfuat•2h ago
Vicki Boykis wrote a small book about it: https://vickiboykis.com/what_are_embeddings/
boulevard•1h ago
This is a great visual guide! I’ve also been working on a similar concept focused on deep understanding - a visual + audio + quiz-driven lesson on LLM embeddings, hosted on app.vidyaarthi.ai.

https://app.vidyaarthi.ai/ai-tutor?session_id=C2Wr46JFIqslX7...

Our goal is to make abstract concepts more intuitive and interactive — kind of like a "learning-by-doing" approach. Would love feedback from folks here.

(Not trying to self-promote — just sharing a related learning tool we’ve put a lot of thought into.)

amelius•1h ago
If LLMs are so smart, then why can't they run directly on 8-bit ASCII input rather than on tokens based on embeddings?
pornel•26m ago
This isn't about smarts, but about performance and memory usage.

Tokens are a form of compression, and working on an uncompressed representation would require more memory and more processing power.
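
To get a feel for the sequence-length side of that, here's a rough sketch using the tiktoken library (assuming the cl100k_base encoding; numbers will vary by tokenizer):

  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")
  text = "The bank was steep and muddy."
  # Positions a byte/character-level model would have to process
  print(len(text.encode("utf-8")), "bytes")
  # Positions a token-level model processes for the same text
  print(len(enc.encode(text)), "tokens")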

amelius•10m ago
The opposite is true. ASCII and English are pretty good at compressing: I can say "cat" with just 24 bits. Your average LLM token embedding uses on the order of kilobits internally.
montebicyclelo•27m ago
Nice tutorial — the contextual vs. static embeddings distinction is the important point; many are familiar with word2vec (static), but contextual embeddings are more powerful for many tasks.

(However, there seems to be some serious back-button / browser history hijacking on this page. Just scrolling down the page appends a ton to my browser history, which is lame.)

khalic•17m ago
What a didactic and well-built article! My thanks to the author.

Roo Code: AI-powered autonomous coding agent for Visual Studio Code

https://marketplace.visualstudio.com/items?itemName=RooVeterinaryInc.roo-cline
1•thunderbong•1m ago•0 comments

Show HN: A tiny Linux tool for clearing Steam achievements

https://github.com/t9t/clear-steam-achievements
1•t9t•8m ago•0 comments

Show HN: SilentGPT – Terminal ChatGPT Client in C (AES-256, No Telemetry)

https://github.com/SilentPuck/SilentGPT
2•silentpuck•8m ago•0 comments

Numair Faraz

1•kwie•9m ago•0 comments

The Rising Cost of Child and Pet Day Care

https://marginalrevolution.com/marginalrevolution/2025/07/the-rising-cost-of-child-and-pet-day-care.html
3•speckx•11m ago•0 comments

A leap toward lighter, sleeker mixed reality displays

https://news.stanford.edu/stories/2025/07/mixed-reality-displays-artificial-intelligence-holograms-research-innovation
1•Improvement•14m ago•0 comments

Overarch – model your software system as data

https://github.com/soulspace-org/overarch
1•michaelsbradley•14m ago•0 comments

Is there demand for a tool that turns natural language to SQL without database?

1•viewer_midoria•19m ago•0 comments

Planning an Effective Lesson Plan for Elementary School – A Practical Guide

https://schezy.com/blog/planning-a-lesson-plan-for-elementary-school
1•qareena•20m ago•1 comments

Wan 2.2 in ComfyUI

https://blog.comfy.org/p/wan22-day-0-support-in-comfyui
2•bsenftner•20m ago•1 comments

Police and sharks: not knowing how your data is processed can lead you astray

https://blog.engora.com/2025/07/police-and-sharks-how-not-knowing-how.html
1•Vermin2000•22m ago•1 comments

Searching for Secrets in Public GCP Images

https://trufflesecurity.com/blog/guest-post-gcp-cloudquarry-searching-for-secrets-in-public-gcp-images
1•alexcos•22m ago•0 comments

No Moar Cookies

https://paretosecurity.com/blog/no-moar-cookies/
2•zupo•26m ago•0 comments

Show HN: BlockDL – A FOSS neural net sketchpad with shape checking and live code

https://blockdl.com
2•Aryagm•26m ago•0 comments

Air Force rolls out sex and age-neutral fitness test for EOD techs

https://taskandpurpose.com/news/air-force-eod-fitness/
1•PaulHoule•29m ago•0 comments

Configuration for AI coding assistants working with the Linux kernel codebase

https://lore.kernel.org/all/20250725175358.1989323-1-sashal@kernel.org/
1•haunter•30m ago•0 comments

July Pebble Update

https://ericmigi.com/blog/july-pebble-update/
1•robin_reala•31m ago•0 comments

Vi.mock Is a Footgun: Why Vi.spyOn Should Be Your Default

https://laconicwit.com/vi-mock-is-a-footgun-why-vi-spyon-should-be-your-default/
1•bmac•33m ago•0 comments

Why GLP-1s could become the "everything drug"

https://www.axios.com/2025/07/28/wegovy-new-uses-cost-safety
1•toomuchtodo•34m ago•0 comments

Show HN: MCP server that lets Claude Code consult other LLMs

https://github.com/raine/consult-llm-mcp
2•rane•36m ago•1 comments

ICEBlock app creator Joshua Aaron to speak at HOPE hacker conference next month

https://hope.net/talks.html#iceblock
25•aestetix•36m ago•1 comments

UK VPN demand soars after debut of Online Safety Act

https://www.theregister.com/2025/07/28/uk_vpn_demand_soars/
8•rntn•38m ago•4 comments

Show HN: I made an easy website that RSVP avoiding intruders in your event

https://convide.online/en/
1•diogosm•39m ago•0 comments

The Secret Rules of the Terminal

https://jvns.ca/blog/2025/06/24/new-zine--the-secret-rules-of-the-terminal/
2•skibz•39m ago•0 comments

Ask HN: How does one get rich in 2025?

1•roschdal•40m ago•1 comments

The only state without a Walgreens pharmacy – The Hustle

https://thehustle.co/originals/the-only-state-without-a-walgreens-pharmacy
1•rbanffy•42m ago•0 comments

Approve merge requests with your eyes closed

https://blog.jse.li/posts/approval/
2•Bogdanp•43m ago•0 comments

Time in Antarctica

https://en.wikipedia.org/wiki/Time_in_Antarctica
1•TheSilva•43m ago•0 comments

Reminiscing About Retro (Don't Forget about Atari)

https://www.goto10retro.com/p/reminiscing-about-retro
2•rbanffy•44m ago•0 comments

Associated vs. unassociated alpha channel compositing

https://medium.com/@giz51d/channeling-alpha-bd32afbfadfa
1•fanf2•45m ago•0 comments