Show HN: Real-time system that tracks how news spreads across 200k websites

https://yandori.io/news-flow/
1•antiochIst•35m ago
I built a system that monitors ~200,000 news RSS feeds in near real-time and clusters related articles to show how stories spread across the web.

It uses Snowflake’s Arctic model for embeddings and HNSW for fast similarity search. Each “story cluster” shows who published first, how fast it propagated, and how the narrative evolved as more outlets picked it up.

Would love feedback on the architecture, scaling approach, and any ways to make the clusters more accurate or useful.

Live demo: https://yandori.io/news-flow/
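For readers who want a concrete picture of the pipeline described in the post, here is a minimal sketch, not the author's implementation. It assumes toy three-dimensional vectors in place of Arctic embeddings and a brute-force cosine comparison against cluster centroids in place of an HNSW index. The names `Article`, `Cluster`, `assign`, `propagation`, the outlet names, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from math import sqrt
from typing import List, Tuple


@dataclass
class Article:
    outlet: str
    published: datetime
    embedding: List[float]  # toy vector; a real system would store model output


@dataclass
class Cluster:
    articles: List[Article] = field(default_factory=list)

    def centroid(self) -> List[float]:
        # Mean of member embeddings, recomputed on demand for simplicity.
        n = len(self.articles)
        dims = len(self.articles[0].embedding)
        return [sum(a.embedding[i] for a in self.articles) / n for i in range(dims)]


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def assign(article: Article, clusters: List[Cluster], threshold: float = 0.8) -> Cluster:
    """Attach the article to the most similar cluster, or start a new one.

    Brute-force scan over centroids; an ANN index such as HNSW would
    replace this loop at scale. The threshold is an assumed value.
    """
    best = max(clusters, key=lambda c: cosine(article.embedding, c.centroid()), default=None)
    if best is not None and cosine(article.embedding, best.centroid()) >= threshold:
        best.articles.append(article)
        return best
    cluster = Cluster([article])
    clusters.append(cluster)
    return cluster


def propagation(cluster: Cluster) -> Tuple[str, float]:
    """Return (first outlet to publish, seconds from first to last publication)."""
    ordered = sorted(cluster.articles, key=lambda a: a.published)
    spread = (ordered[-1].published - ordered[0].published).total_seconds()
    return ordered[0].outlet, spread


clusters: List[Cluster] = []
feed = [
    Article("wire-a", datetime(2025, 11, 25, 9, 0), [1.0, 0.0, 0.1]),
    Article("daily-b", datetime(2025, 11, 25, 9, 20), [0.9, 0.1, 0.1]),
    Article("sports-c", datetime(2025, 11, 25, 9, 30), [0.0, 1.0, 0.0]),
]
for item in feed:
    assign(item, clusters)

print(len(clusters))              # 2
print(propagation(clusters[0]))   # ('wire-a', 1200.0)
```

The first two articles land in one cluster because their vectors point the same way; the third starts its own. The "who published first" and "how fast it propagated" figures then fall out of sorting each cluster by publish time.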

Comments

masterphai•29m ago
Interesting project - it’s rare to see news-flow tracking done in real time at this scale. One thing you may want to stress-test is how stable the clustering remains when stories evolve semantically over a few hours. Embeddings tend to drift as outlets rewrite or localize a piece, and clustering built on HNSW neighbor lookups can over-merge when a cluster’s centroid shifts.

A trick that helped in a similar system I built was doing a second-pass “temporal coherence” check: if two articles are close in embedding space but far apart in publish time or share no common entities, keep them in adjacent clusters rather than forcing a merge. It reduced false positives significantly.
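The second-pass check described above could look roughly like the sketch below. It is only an illustration of the stated rule, not the commenter's actual code: the function name `should_merge`, the 0.85 similarity threshold, and the six-hour window are assumptions, and entity extraction is taken as already done (each article arrives with a set of entity strings).

```python
from datetime import datetime, timedelta
from typing import Set


def should_merge(
    similarity: float,
    published_a: datetime,
    published_b: datetime,
    entities_a: Set[str],
    entities_b: Set[str],
    sim_threshold: float = 0.85,            # assumed value
    max_gap: timedelta = timedelta(hours=6),  # assumed value
) -> bool:
    """Merge only if the pair is close in embedding space AND passes
    the temporal-coherence test: published near each other in time and
    sharing at least one named entity."""
    if similarity < sim_threshold:
        return False
    too_far_apart = abs(published_a - published_b) > max_gap
    no_shared_entities = not (entities_a & entities_b)
    # Either failure keeps the articles in adjacent clusters.
    return not (too_far_apart or no_shared_entities)


t0 = datetime(2025, 11, 25, 9, 0)
print(should_merge(0.9, t0, t0 + timedelta(hours=1), {"Paris"}, {"Paris", "EU"}))  # True
print(should_merge(0.9, t0, t0 + timedelta(days=2), {"Paris"}, {"Paris"}))         # False
print(should_merge(0.9, t0, t0 + timedelta(hours=1), {"Paris"}, {"Tokyo"}))        # False
```

Running this as a filter after the nearest-neighbor lookup means the vector index itself stays untouched; only the merge decision gets stricter.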

Also curious how you handle deduping syndicated content - AP/Reuters can dominate the embedding space unless you weight publisher identity or canonical URLs.

Overall, really nice work. The propagation timeline is especially useful.
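On the syndication question raised above, one possible approach is to collapse wire copies before they reach the index, keyed on canonical URL where a feed item provides one and on a hash of normalized text otherwise. This is a sketch under those assumptions; the helper names, the `(outlet, canonical_url, body)` tuple shape, and the example URLs are hypothetical, and exact-hash matching will miss copies that were lightly edited.

```python
import hashlib
import re
from typing import Dict, Iterable, List, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    # Lowercase scheme and host, drop query string, fragment, trailing slash.
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"), "", "")
    )


def content_key(canonical_url: Optional[str], body: str) -> str:
    """Prefer the canonical URL; fall back to a hash of normalized text."""
    if canonical_url:
        return "url:" + normalize_url(canonical_url)
    text = re.sub(r"\W+", " ", body.lower()).strip()
    return "text:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def group_syndicated(
    items: Iterable[Tuple[str, Optional[str], str]]
) -> Dict[str, List[str]]:
    """Map each content key to the outlets that carried that content."""
    groups: Dict[str, List[str]] = {}
    for outlet, canonical_url, body in items:
        groups.setdefault(content_key(canonical_url, body), []).append(outlet)
    return groups


items = [
    ("outlet-a", "https://Example.com/wire/story-1/?utm_source=x", "Body"),
    ("outlet-b", "https://example.com/wire/story-1", "Body, edited"),
    ("outlet-c", None, "Hello,  World!"),
    ("outlet-d", None, "hello world"),
]
groups = group_syndicated(items)
print(sorted(groups.values()))  # [['outlet-a', 'outlet-b'], ['outlet-c', 'outlet-d']]
```

Indexing one embedding per group, while keeping the outlet list for the propagation timeline, would stop a widely syndicated piece from counting many times in the embedding space.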

CS QLola

https://news.ycombinator.com
1•bappaforjio•1m ago•0 comments

Lifetime Safety in Clang – 2025 US LLVM Developers' Meeting [video]

https://www.youtube.com/watch?v=3zWK7Lx96vI
1•matt_d•4m ago•0 comments

Joe Armstrong – The mess we are in

https://youtu.be/lKXe3HUG2l4?si=YEbsd9xOCH_yP_C2
1•lifeisstillgood•8m ago•0 comments

Ask HN: Hard and deep tech – why are Jira and Confluence the go-to PM tools?

1•dnlh_lvg•8m ago•1 comment

Dr. Chainlove Or: How I Learned to Stop Worrying and Love On-Chain Gaming

https://organizedplayer.substack.com/p/dr-chainlove-or-how-i-learned-to
1•0north•9m ago•0 comments

Prosecutor Used Flawed A.I. To Keep a Man in Jail, His Lawyers Say

https://www.nytimes.com/2025/11/25/us/prosecutor-artificial-intelligence-errors-lawyers-californi...
3•perihelions•10m ago•0 comments

BebboSSH: SSH2 implementation for Amiga systems (68000, GPLv3)

https://franke.ms/git/bebbo/bebbossh
1•snvzz•11m ago•0 comments

Genesis Mission – A National Mission to Accelerate Science Through AI

https://genesis.energy.gov/
1•Anon84•14m ago•0 comments

Design Follows Data Structures

https://www.tedinski.com/2019/01/29/data-structures-are-fundamental.html
2•plutonium3345•16m ago•0 comments

Maybe some people should just give up [video]

https://www.youtube.com/watch?v=rsoEipuwXiI
1•koakuma-chan•19m ago•0 comments

I tracked 609 food additives across 817K products to find awareness gaps

https://compareadditives.com
4•markvitals•19m ago•2 comments

GrapheneOS ceases operations in France amid pressure and legal threats

https://alternativeto.net/news/2025/11/grapheneos-ceases-operations-in-france-amid-pressure-and-l...
2•airhangerf15•22m ago•0 comments

Are LLMs the Best That They Will Ever Be?

https://asimovaddendum.substack.com/p/are-llms-the-best-that-they-will
3•rufusrock•22m ago•2 comments

Scientists can now watch metal crystals grow inside liquid metal

https://theconversation.com/scientists-can-now-watch-metal-crystals-grow-inside-liquid-metal-270451
3•billybuckwheat•29m ago•0 comments

Automating Linux Backups with Rsync: A Set-and-Forget Strategy

https://orioninsist.org/blog/linux-automated-backup-rsync-guide/
1•orioninsist•33m ago•0 comments

Show HN: Free macro dashboards with downloadable charts (e.g., EUR/USD)

https://fxmacrodata.com/dashboard/EUR_USD
1•roberttidball•35m ago•1 comment

Show HN: Real-time system that tracks how news spreads across 200k websites

https://yandori.io/news-flow/
1•antiochIst•35m ago•1 comment

Credits Are Not It

https://hengar.pika.page/posts/credits-are-not-it
2•hengar•38m ago•0 comments

Space: 1999 – Special Effects Techniques

https://catacombs.space1999.net/main/pguide/upsfx.html
6•exvi•43m ago•0 comments

Show HN: Runtime Verification for SQL Agents

https://github.com/yudduy/sql_exenv
1•yudduy•45m ago•1 comment

In Praise of Bibliomania

https://lithub.com/nothing-better-than-a-whole-lot-of-books-in-praise-of-bibliomania/
3•bookofjoe•46m ago•0 comments

Other Winfield Creations (2002)

https://c-we.com/piranha/page9.htm
1•exvi•47m ago•0 comments

GM Reward Loophole Explained: Cars Paid Off in Seconds

https://resellcalendar.com/news/news/gm-reward-loophole-explained-cars-paid-off-in-seconds/
3•typeofhuman•47m ago•1 comment

Deconstructing the Spinner: A One on One chat with Gene Winfield (2000)

https://media.bladezone.com/contents/film/interviews/gene-winfield/
1•exvi•49m ago•0 comments

A man who's been waiting in jail for his day in court for 6 years

https://substack.com/inbox/post/178811090
3•msdrigg•49m ago•1 comment

Singapore orders Apple, Google to prevent gov spoofing on messaging platforms

https://www.reuters.com/world/asia-pacific/singapore-orders-apple-google-prevent-government-spoof...
3•phantomathkg•52m ago•0 comments

Proton Meet: Secure, end-to-end encrypted video conferencing

https://proton.me/meet
15•absqueued•1h ago•2 comments

Mystery of the Quintic

https://youtu.be/9HIy5dJE-zQ
1•surprisetalk•1h ago•0 comments

Show HN: Agentic Arena – 52 tasks implemented by Opus 4.5, Gemini 3, and GPT-5.1

https://arena.logic.inc/
1•sgk284•1h ago•2 comments

Employee quits job over an Nvidia RTX 5060

https://www.tomshardware.com/pc-components/gpus/employee-quits-job-over-an-nvidia-rtx-5060-intern...
16•R_Uttam•1h ago•14 comments