
Show HN: Changeflow – Giving up on pixel diffs after 10 years of false positives

https://changeflow.com/
1•stevewillbe•1w ago
I've been building website monitoring tools since 2015. The core problem with pixel-diff screenshots: every ad rotation, every layout tweak = alert noise. Legal and compliance teams kept asking "just tell me WHAT changed."

So I rebuilt it. Changeflow extracts semantic changes and summarizes them in plain English:

- "FDA posted new adaptive trial guidance (Jan 15)"
- "Competitor raised enterprise pricing 12%"
- "9th Circuit issued opinion on arbitration agreements"

Instead of "47 pixels changed in the header region."

THE HARD TECHNICAL PROBLEMS

Scraping any URL (not just specific sites)

Unlike scrapers built for Amazon or LinkedIn, users give us any URL and expect it to work. Our approach:

Delayed-attach pattern: launch Chrome, let page load naturally, poll /json endpoint for title+URL stability, only THEN attach Puppeteer. Bot detection scripts run against a clean browser.
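A minimal sketch of the stability check in the delayed-attach pattern, assuming Chrome was started with a remote debugging port so its DevTools HTTP endpoint (/json) lists open targets. The function names, the port, the polling interval, and the "three identical polls in a row" rule are illustrative assumptions, not Changeflow's actual values. The attach step itself is left to the caller.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_targets(port: int = 9222) -> list:
    """Read the target list from Chrome's DevTools HTTP endpoint."""
    url = f"http://127.0.0.1:{port}/json"
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.load(resp)


def wait_until_stable(
    fetch: Callable[[], list] = fetch_targets,
    stable_polls: int = 3,   # assumed: identical polls required in a row
    interval: float = 0.5,   # assumed: seconds between polls
    timeout: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Return the page target once its title and URL stop changing, else None."""
    deadline = time.monotonic() + timeout
    last, streak = None, 0
    while time.monotonic() < deadline:
        pages = [t for t in fetch() if t.get("type") == "page"]
        if pages:
            snapshot = (pages[0].get("title"), pages[0].get("url"))
            # Count consecutive polls that returned the same title and URL.
            streak = streak + 1 if snapshot == last else 1
            last = snapshot
            if streak >= stable_polls:
                return pages[0]  # stable: the caller can attach now
        sleep(interval)
    return None
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without a running browser.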

Three-tier fallback: Linux + datacenter proxy (90% of sites) -> Linux + mobile proxy (9%) -> macOS + real hardware (1%). Cache successful routes per-URL. Expensive path rarely fires.

Real Chrome, not Chrome for Testing (its fingerprint is detectable). On real Mac hardware, disable GPU spoofing entirely - genuine beats fake.
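The three-tier fallback with per-URL route caching could look roughly like this. The tier labels, the `fetch` callback signature, and the plain dict used as the cache are assumptions for illustration; a real system would presumably persist the cache and expire stale routes.

```python
from typing import Callable, Dict, Optional

# Tiers from the post, ordered cheapest first. The string labels are made up.
TIERS = ["linux-datacenter-proxy", "linux-mobile-proxy", "macos-real-hardware"]

# Hypothetical fetcher: (tier, url) -> HTML on success, None on failure.
Fetcher = Callable[[str, str], Optional[str]]


def fetch_with_fallback(url: str, fetch: Fetcher, cache: Dict[str, str]) -> Optional[str]:
    """Try the cached tier first, then the remaining tiers in cost order."""
    cached = cache.get(url)
    order = ([cached] if cached else []) + [t for t in TIERS if t != cached]
    for tier in order:
        html = fetch(tier, url)
        if html is not None:
            cache[url] = tier  # remember the route that worked for this URL
            return html
    cache.pop(url, None)  # nothing worked: forget any stale route
    return None
```

Because the cached route is tried first, a URL that only works through the mobile proxy skips the datacenter attempt on later fetches.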

LLM costs at scale

Running AI on every fetch gets expensive. We cut costs 90%:

Strip nav/sidebars/footers before the AI call (~60% token reduction).

Model tiering: Llama 3.1 8B via Groq for extraction, Gemini Flash Lite for summaries, Claude only when quality matters.
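As a rough illustration of the stripping step, here is a sketch using only Python's standard-library HTML parser. The set of tags to drop is an assumption (the post names nav, sidebars, and footers; script and style are added here because they carry no readable text), and real pages often need class- or role-based rules as well.

```python
from html.parser import HTMLParser

# Assumed tag set: page chrome plus non-text elements.
SKIP_TAGS = {"nav", "aside", "footer", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collect text that is not nested inside any of SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text outside the skipped regions.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def strip_chrome(html: str) -> str:
    """Return the page text with navigation, sidebars, and footers removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Only the remaining text would then be sent to the model, which is where the token savings come from.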

Gemini cache trick: 1024+ token system prompts get 90% discount on repeat calls. Verbose prompts are actually cheaper.
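The arithmetic behind "verbose prompts are actually cheaper" can be checked with a small model. It takes the post's figures at face value (a 1024-token minimum and a 90% discount on repeat calls) and ignores everything else, such as cache storage fees or expiry. The function name and the unit price are hypothetical.

```python
def prompt_cost(
    prompt_tokens: int,
    calls: int,
    price_per_token: float = 1.0,   # arbitrary unit price
    min_cached_tokens: int = 1024,  # threshold quoted in the post
    cached_discount: float = 0.90,  # discount quoted in the post
) -> float:
    """Relative input cost of sending the same system prompt `calls` times."""
    if calls <= 0:
        return 0.0
    full = prompt_tokens * price_per_token
    if prompt_tokens < min_cached_tokens:
        return full * calls  # too short to be cached: full price every time
    repeat = full * (1 - cached_discount)
    return full + repeat * (calls - 1)  # first call at full price, rest discounted
```

Under these assumptions, 100 calls with a 900-token prompt cost 90,000 units, while 100 calls with an 1,100-token prompt cost about 11,990 units.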

Diffing beyond git diff

Git diff isn't enough. We add MD5 hashes to list items for move detection, use Levenshtein distance to distinguish edits from replacements, and clean temporal noise ("2 days ago") that creates false positives.
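A sketch of how those three ideas might combine, starting from the removed and added items a line diff would report. The regex for relative timestamps, the 0.4 threshold, and the label names are assumptions for illustration, not Changeflow's actual rules.

```python
import hashlib
import re

# Assumed pattern for relative timestamps such as "2 days ago".
RELATIVE_TIME = re.compile(
    r"\b\d+\s+(?:second|minute|hour|day|week|month|year)s?\s+ago\b", re.I
)


def normalize(text: str) -> str:
    """Drop temporal noise and collapse whitespace."""
    return " ".join(RELATIVE_TIME.sub("", text).split())


def item_hash(text: str) -> str:
    return hashlib.md5(normalize(text).encode("utf-8")).hexdigest()


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def classify(removed: list, added: list, edit_threshold: float = 0.4) -> dict:
    """Label each added item as moved, edited, or replaced."""
    old = {item_hash(t): normalize(t) for t in removed}
    added_hashes = {item_hash(t) for t in added}
    # Removed items with no exact counterpart are candidates for edit matching.
    candidates = [text for h, text in old.items() if h not in added_hashes]
    labels = {}
    for item in added:
        if item_hash(item) in old:
            # Same content after noise cleanup: a move, not a real change.
            labels[item] = "moved"
            continue
        norm = normalize(item)
        best = min(
            (levenshtein(norm, c) / max(len(norm), len(c), 1) for c in candidates),
            default=1.0,
        )
        labels[item] = "edited" if best <= edit_threshold else "replaced"
    return labels
```

Items whose only difference is the relative timestamp hash to the same value, so they never reach the edit-distance step.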

STACK

Rails + Postgres, Faktory workers, Node.js browser pool, Claude/Gemini/Llama via OpenRouter, proxies from GridPanel and SquidProxies.

Happy to answer questions about the scraping, AI, or 10 years of lessons in this space.