Thought making one of these cluster maps would be interesting. Used Nomic embeddings, HDBSCAN, and UMAP, with Gemma 3 27B (via Ollama) to label the clusters. Also looked into the dataset to find the most active posting times, the most frequently posted domains, and other trends from this past year.
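For anyone curious how the posting-time and domain stats can be computed, here's a rough sketch using only the Python standard library. It is not the code from the write-up: the sample rows are made up, and the `time` (Unix seconds) and `url` field names are an assumption modeled on Hacker News API items. It doesn't touch the embedding/clustering side at all.

```python
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlparse

# Hypothetical sample rows, NOT real data. Field names ("time", "url")
# are an assumption modeled on Hacker News API items.
SAMPLE_POSTS = [
    {"time": 1735741800, "url": "https://github.com/example/repo"},    # 2025-01-01 14:30 UTC
    {"time": 1735745400, "url": "https://www.example.com/blog/post"},  # 2025-01-01 15:30 UTC
    {"time": 1735828200, "url": "https://github.com/another/project"}, # 2025-01-02 14:30 UTC
    {"time": 1735828800, "url": None},                                 # text post, no link
]


def posts_per_hour(posts):
    """Count posts by hour of day (0-23) in UTC."""
    counts = Counter()
    for post in posts:
        hour = datetime.fromtimestamp(post["time"], tz=timezone.utc).hour
        counts[hour] += 1
    return counts


def top_domains(posts, n=10):
    """Return the n most common link domains, ignoring a leading 'www.'."""
    counts = Counter()
    for post in posts:
        url = post.get("url")
        if not url:
            continue  # skip text-only posts
        host = urlparse(url).hostname  # lowercased host, or None if unparseable
        if not host:
            continue
        if host.startswith("www."):
            host = host[4:]
        counts[host] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    print(posts_per_hour(SAMPLE_POSTS).most_common(3))
    print(top_domains(SAMPLE_POSTS, n=5))
```

Bucketing by UTC hour is a simplification; converting to a specific timezone before taking the hour would shift the peaks accordingly.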
Write-up and other findings: https://lincolnmaxwell.com/p/clustering-hackernews-2025/
Interactive map: https://hackernews-clustered-2025.labs.lincolnmaxwell.com/
Heads up: the map is ~20 MB in size (~7 MB transferred over the network), so don’t open it on a metered connection.
AMA!