
OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
473•klaussilveira•7h ago•116 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
812•xnx•12h ago•487 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
157•isitcontent•7h ago•17 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
156•dmpetrov•7h ago•67 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
32•matheusalmeida•1d ago•1 comment

A century of hair samples proves leaded gas ban worked

https://arstechnica.com/science/2026/02/a-century-of-hair-samples-proves-leaded-gas-ban-worked/
91•jnord•3d ago•12 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
50•quibono•4d ago•6 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
260•vecti•9h ago•122 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
207•eljojo•10h ago•134 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
328•aktau•13h ago•158 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
327•ostacke•13h ago•86 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
411•todsacerdoti•15h ago•219 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
22•kmm•4d ago•1 comment

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
337•lstoll•13h ago•241 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
52•phreda4•6h ago•9 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
4•romes•4d ago•0 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
195•i5heu•10h ago•144 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
115•vmatsiiako•12h ago•38 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
152•limoce•3d ago•79 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
244•surprisetalk•3d ago•32 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
996•cdrnsf•16h ago•420 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
46•rescrv•15h ago•17 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
25•gfortaine•5h ago•3 comments

I'm going to cure my girlfriend's brain tumor

https://andrewjrod.substack.com/p/im-going-to-cure-my-girlfriends-brain
67•ray__•3h ago•28 comments

Evaluating and mitigating the growing risk of LLM-discovered 0-days

https://red.anthropic.com/2026/zero-days/
38•lebovic•1d ago•11 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
78•antves•1d ago•59 comments

How virtual textures work

https://www.shlom.dev/articles/how-virtual-textures-really-work/
30•betamark•14h ago•28 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
41•nwparker•1d ago•11 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
7•gmays•2h ago•2 comments

Evolution of car door handles over the decades

https://newatlas.com/automotive/evolution-car-door-handle/
41•andsoitis•3d ago•62 comments

Visualizing GPT-OSS-20B embeddings

https://melonmars.github.io/LatentExplorer/embedding_viewer.html
89•melonmars•5mo ago

Comments

kingstnap•5mo ago
It's an interesting-looking plot, I suppose.

My guess is it's the 2 largest principal components of the embedding.

But none of the points are labelled? There isn't a writeup on the page or anything?

jablongo•5mo ago
Usually PCA doesn't look quite like this, so this was likely done using t-SNE or UMAP, which are non-parametric embeddings (they optimize a loss by modifying the embedded points directly). I can see labels if I mouse over the dots.
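
For illustration, here's roughly how a plot like this is usually produced (a sketch only; the page doesn't document its pipeline, and umap-learn plus the random stand-in matrix are my assumptions):

  import numpy as np
  from sklearn.decomposition import PCA
  import umap  # pip install umap-learn

  # Stand-in for a real (vocab_size x hidden_dim) token embedding matrix.
  emb = np.random.randn(5000, 768).astype(np.float32)

  # Linear projection: preserves global variance, tends to produce one blob.
  pca_2d = PCA(n_components=2).fit_transform(emb)

  # Non-parametric projection: optimizes the 2D point positions directly,
  # which yields the separated "islands" typical of plots like this one.
  umap_2d = umap.UMAP(n_components=2, metric="cosine").fit_transform(emb)
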
terhechte•5mo ago
I can see the labels when I hover with the pointer
graphviz•5mo ago
What do people learn from visualizations like this?

What is the most important problem anyone has solved this way?

Speaking as somewhat of a co-defendant.

jablongo•5mo ago
It lets you inspect what actually constitutes a given cluster. For example, it seems like the outer clusters are variations of individual words and their direct translations, rather than synonyms (the ones I saw, at least).
minimaxir•5mo ago
Not everything has to be directly informative or solve a problem. Sometimes data visualization can look pretty for pretty's sake.

Dimensionality reduction/clustering like this may be less useful for identifying trends in token embeddings, but for other types of embeddings it's extremely useful.

diwank•5mo ago
Agreed. The fact that it has any structure at all is fascinating (and super pretty). It could hint at interesting internal structures. I would love to see a version for Qwen-3 and Mistral too!

I wonder if being trained on significant amounts of synthetic data gave it any unique characteristics.

TuringNYC•5mo ago
> What do people learn from visualizations like this?

Applying the embedding model to a dataset of your own and then building a similar visualization is where it gets cool, because you can visually inspect clusters and draw conclusions about the closeness of items in your own dataset.
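
A sketch of that workflow (the sentence-transformers model and the toy texts are placeholders):

  from sentence_transformers import SentenceTransformer
  from sklearn.decomposition import PCA
  import matplotlib.pyplot as plt

  texts = ["refund request", "money back please", "login broken", "can't sign in"]
  vecs = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)

  # Project to 2D and label each point; related texts should land together.
  xy = PCA(n_components=2).fit_transform(vecs)
  plt.scatter(xy[:, 0], xy[:, 1])
  for (x, y), t in zip(xy, texts):
      plt.annotate(t, (x, y))
  plt.show()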

ethan_smith•5mo ago
Embedding visualizations have helped identify bias in word embeddings (Word2Vec), debug entity resolution systems, and optimize document retrieval by revealing semantic clusters that inform better indexing strategies.
graphviz•5mo ago
Interesting, glad to know it's been useful for some specific contributions. (Not questioning that interesting-looking, appealing displays as overviews for general awareness are also worthwhile.)
_def•5mo ago
I have the suspicion that this is how GPT-OSS-20B would generate a visualization of its embeddings. Happy to learn otherwise.
eddywebs•5mo ago
Cool! Would it be possible to generate visualizations for any given open-weight model out there?
minimaxir•5mo ago
Yes, it's just yoinking the weights out of the embeddings layer.
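
Roughly like this with Hugging Face transformers (a sketch; the repo name is just the obvious example, and the full 20B download is hefty):

  from transformers import AutoModelForCausalLM

  model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b")
  # Rows of this matrix are the per-token embeddings being plotted.
  emb = model.get_input_embeddings().weight.detach().cpu().numpy()
  print(emb.shape)  # (vocab_size, hidden_dim)
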
numpad0•5mo ago
Is this handling Unicode correctly? It seems like even a lot of Latin-alphabet tokens are getting mangled.
int_19h•5mo ago
It looks like it's not handling UTF-8 at all and displaying it as if it were Latin-1
mkl•5mo ago
I don't think it's actually UTF-8. The data is at https://melonmars.github.io/LatentExplorer/embeddings_2d.jso... and contains things like

  "\u00e0\u00a7\u012d\u00e0\u00a6\u013e"
with some characters > 0xff (but none above 0x0143, weirdly).
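
That 0x0143 ceiling is consistent with GPT-2-style byte-level BPE, whose byte-to-unicode table tops out at 256 + 67 = 0x143. Assuming that's what this tokenizer uses (an assumption, though it's a common convention), the original bytes can be recovered by inverting the table:

  # Sketch: invert the GPT-2 byte-level BPE byte-to-unicode mapping
  # (an assumption about this tokenizer, suggested by the 0x0143 ceiling).
  def bytes_to_unicode():
      # Printable bytes map to themselves; the remaining 68 bytes map
      # to code points 256..323 (0x100..0x143).
      bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
      cs = list(bs)
      n = 0
      for b in range(256):
          if b not in bs:
              bs.append(b)
              cs.append(256 + n)
              n += 1
      return dict(zip(bs, map(chr, cs)))

  u2b = {c: b for b, c in bytes_to_unicode().items()}

  def decode_token(s):
      return bytes(u2b[c] for c in s).decode("utf-8", errors="replace")

  print(decode_token("\u00e0\u00a7\u012d\u00e0\u00a6\u013e"))

Under that assumption the sample string above decodes to Bengali characters, so the data isn't mis-encoded UTF-8 so much as the tokenizer's byte alphabet displayed raw.
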
ashvardanian•5mo ago
Any good comparisons of traditional embedding models against embeddings derived from autoregressive language models?
minimaxir•5mo ago
They are incomparable. Token embeddings generated with something like word2vec worked well because the networks are shallow and therefore the learned semantic data can be contained solely and independently within the embeddings themselves. Token embeddings as a part of an LLM (e.g. gpt-oss-20b) are conditioned on said LLM and do not have fully independent learned data, although as shown here there still can be some relationships preserved.

Embeddings derived from autoregressive language models apply full attention mechanisms to get something different entirely.
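
The distinction in code, roughly (a sketch; gpt2 is just a small stand-in for gpt-oss-20b):

  import torch
  from transformers import AutoTokenizer, AutoModel

  tok = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModel.from_pretrained("gpt2")
  ids = tok("bank of the river", return_tensors="pt")

  # Static: one fixed vector per token id, no context applied.
  static = model.get_input_embeddings()(ids["input_ids"])

  # Contextual: the same tokens after every attention layer has run,
  # so the vector for "bank" now depends on "river".
  with torch.no_grad():
      contextual = model(**ids).last_hidden_state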

lawlessone•5mo ago
What does it mean that some embeddings are close to others in this space?

That they're related or connected, or is it arbitrary?

Why does it look like a fried egg?

edit: they must be related in some way, as one of the "droplets" in the bottom-left quadrant seems to consist of various versions of the word "parameter"

minimaxir•5mo ago
Typically these algorithms cluster by similarity (either Euclidean or cosine).

The density of the clusters tends to show trends. In this case, the "yolk" has a lot of bizarre Unicode tokens.
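
The two metrics can rank neighbors differently, which is easy to check (a toy sketch with made-up numbers):

  import numpy as np
  from scipy.spatial.distance import cdist

  a = np.array([[1.0, 0.0]])
  b = np.array([[10.0, 0.5],   # same direction as a, but far away
                [0.1, 0.1]])   # close to a, but 45 degrees off

  print(cdist(a, b, metric="euclidean"))  # nearest: the second point
  print(cdist(a, b, metric="cosine"))     # nearest: the first point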

esafak•5mo ago
Without a way to tune it, this visualization is as much about the dimensionality-reduction algorithm used as about the embeddings themselves, because trade-offs are unavoidable when you go from a very high-dimensional space to a 2D one. I would not read too much into it.
promiseofbeans•5mo ago
This demo is a lot more useful for comparing word embeddings: https://www.cs.cmu.edu/~dst/WordEmbeddingDemo/index.html

You can choose which dimensions to show, pick which embeddings to display, and play with vector maths between them in a visual way.

It doesn't show the whole set of embeddings, though I am sure someone could fix that, as well as adapt it to use the gpt-oss model instead of the custom (?) mini set it uses.
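
The classic vector-maths identity from demos like that one can be reproduced in code (a sketch; the GloVe set named here is one of the standard options in gensim's downloader):

  import gensim.downloader as api

  glove = api.load("glove-wiki-gigaword-100")  # small pretrained word vectors
  # king - man + woman ~= queen
  print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=3))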

voodooEntity•5mo ago
@Author: I'd recommend giving

https://github.com/vasturiano/3d-force-graph

a try; for the text labels you can use

https://github.com/vasturiano/three-spritetext

It's based on Three.js and creates great GPU-rendered (WebGL) 3D graph visualisations. This could make it a lot more interesting to watch, because it could display actual depth (your GPU is gonna run hot, but I guess it's worth it).

Just a suggestion.

suprjami•5mo ago
Why does it look like an image of an asteroid hitting a planet?

https://stock.adobe.com/images/asteroid-hitting-the-earth-ai...