frontpage.

Jeffrey Epstein's digital cleanup crew

https://www.theverge.com/report/876081/jeffrey-epstein-files-seo-google-digital-footprint-emails
1•imartin2k•4s ago•0 comments

Real-time Reddit sentiment tracker for stock trading

https://www.wsbsentiment.com/
1•shawnmfarnum•22s ago•1 comments

Trump's War on History

https://www.motherjones.com/politics/2026/02/america-freedom-task-force-250-trump-anniversary-his...
1•leotravis10•22s ago•0 comments

Quitting .NET after 22 years

https://www.thatsoftwaredude.com/content/14253/quitting-dot-net-after-22-years
1•Waltz1•1m ago•0 comments

Is human collaboration the answer to the skill formation risks by AI?

https://www.gethopp.app/blog/pair-prompting
1•iparaskev•5m ago•0 comments

Microsoft Should Watch the Expanse

https://idiallo.com/blog/microsoft-should-watch-the-expanse
1•nomdep•5m ago•0 comments

Show HN: Cosmic CLI – Build, deploy, and manage apps from your terminal with AI

https://github.com/cosmicjs/cli
1•tonyspiro•5m ago•0 comments

AgentLogs: Open-source observability for AI coding agents

https://github.com/agentlogs/agentlogs
1•tosh•6m ago•0 comments

WordCatcher

https://wordwalker.ca/games/word-catcher/
1•petedrinnan•6m ago•0 comments

Breakthrough pancreatic cancer therapy blocks tumor resistance in mice

https://www.pnas.org/doi/10.1073/pnas.2523039122
1•DpdC•7m ago•0 comments

Show HN: Multimodal perception system for real-time conversation

https://raven.tavuslabs.org
2•mert_gerdan•8m ago•1 comments

Heuristics for lab robotics, and where its future may go

https://www.owlposting.com/p/heuristics-for-lab-robotics-and-where
1•abhishaike•9m ago•0 comments

Show HN: Traction – Security readiness framework for scaling SaaS teams

https://traction.fyi
1•ERROR_0x06•10m ago•0 comments

Crossview v3.5.0 – New auth modes (header / none), no DB required for proxy auth

https://github.com/corpobit/crossview
1•moeidheidari•10m ago•1 comments

Show HN: Tasty A.F. – Turn Any Online Recipe into a 3x5 Notecard

https://tastyaf.recipes
1•adammfrank•10m ago•0 comments

Photoswitching for chromocontrol of TRPC4/5 channel functions in live tissues

https://www.nature.com/articles/s41589-025-02085-x
2•PaulHoule•11m ago•0 comments

This feels so reminiscent of the whimsical times in tech

https://www.tryroro.com/code
2•songqipu•13m ago•1 comments

Hello, Dada

https://smallcultfollowing.com/babysteps/blog/2026/02/09/hello-dada/
2•ibobev•13m ago•0 comments

Expectation and Copysets

https://buttondown.com/jaffray/archive/expectation-and-copysets/
2•ibobev•14m ago•0 comments

LLMCode Lab – Compare up to 5 LLMs side-by-side, then fuse the best answers

https://LLMCode.ai
2•cmeshare•14m ago•2 comments

BurgerDisk Tests

https://www.colino.net/wordpress/archives/2026/02/08/burgerdisk-tests/
2•ibobev•14m ago•0 comments

In praise of the dad joke (2023)

https://wit.substack.com/p/the-familiar-patter-of-the-paterfamilias
2•NaOH•15m ago•0 comments

Looking for feedback from someone who hired technical freelancers earlier

2•yusufhgmail•16m ago•0 comments

Update on Update [video]

https://www.youtube.com/watch?v=M-ZLz8Wg34s
2•tosh•16m ago•0 comments

USDA's reputation suffers after revisions in US corn acres

https://www.reuters.com/business/usdas-reputation-suffers-after-massive-revisions-us-corn-acres-2...
3•DustinEchoes•16m ago•0 comments

Updating the Expiring Secure Boot Certificates Is Sure to Go Without a Hitch

https://pcper.com/2026/02/updating-the-expiring-secure-boot-certificates-is-sure-to-go-without-a-...
2•speckx•17m ago•0 comments

'We feel it in our bones': Can a machine ever love you?

https://www.bbc.com/future/article/20260209-can-a-machine-ever-love-you
4•devonnull•18m ago•0 comments

Google hit by European publishers' complaint to EU over AI Overviews

https://www.reuters.com/world/european-publishers-council-files-eu-antitrust-complaint-about-goog...
3•thm•20m ago•0 comments

Writing RSS reader in 80 lines of bash

https://yobibyte.github.io/yr.html
3•sharjeelsayed•20m ago•0 comments

Simulated phishing test f#%k off

https://github.com/orsifrancesco/simulated-phishing-test-list
2•orsifrancesco•20m ago•1 comments

Runtime validation is still fucked in AI coding agents

1•sebringj•1h ago
AI agents (Cursor, Claude computer-use, Copilot agent mode, etc.) have gotten stupidly good at spitting out code. Prompt → boom, clean code. The marketing says "it just works."

It fucking doesn't.

You run it in a real app and immediately hit the same bullshit wall every time:

- Hallucinated logic only reveals itself under real data or edge cases
- UI updates magically forget to sync across devices (mobile → web = sad trombone)
- API calls quietly return 401s or other crap that gets swallowed in some lazy try-catch
- Vision-based agents crawl like molasses (2–10s per action) and torch tokens like it's free
- Background pings and unrelated fetches make it impossible to tell what actually caused what
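
To make the swallowed-401 case concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: `fake_fetch` stands in for a real HTTP client, and `ApiError`, `load_profile_swallowing`, and `load_profile_surfacing` are made-up names, not part of any tool mentioned here.

```python
from dataclasses import dataclass


@dataclass
class ApiResult:
    status: int
    body: str


class ApiError(Exception):
    """Raised when a call returns a non-200 status."""

    def __init__(self, status: int, url: str):
        super().__init__(f"{url} returned HTTP {status}")
        self.status = status
        self.url = url


def fake_fetch(url: str) -> ApiResult:
    # Hypothetical stand-in for a real HTTP client; always answers 401.
    return ApiResult(status=401, body="unauthorized")


def load_profile_swallowing(url: str):
    """The failure mode: a broad except hides the 401 from everyone."""
    try:
        result = fake_fetch(url)
        if result.status != 200:
            raise ApiError(result.status, url)
        return result.body
    except Exception:
        # The caller only sees None and cannot tell auth failed.
        return None


def load_profile_surfacing(url: str, errors: list):
    """Records the failure as structured data, then re-raises it."""
    result = fake_fetch(url)
    if result.status != 200:
        errors.append({"url": url, "status": result.status})
        raise ApiError(result.status, url)
    return result.body
```

The first version is what generated code tends to look like; the second leaves a record that an agent (or a human) can inspect after the run.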

I tried pretty much everything out there and none of it quite scratched the itch I had: fast, structured, cross-platform runtime visibility without vision bloat or having to wire up a ton of hooks.

Quick rundown of the usual suspects:

- Pure vision/computer-use (Claude 3.5/4, ADEPT-style): zero setup, works on anything, but latency from hell and token burn is brutal for anything longer than a demo
- Playwright / browser MCP servers: fast and structured for web, but web-only, selectors shatter like glass, no native mobile
- Appium + vision hybrids: cross-platform on paper, but still vision-dependent and setup is a pain
- Sandboxed agents (OpenHands, SWE-agent): decent for repo tasks and shell stuff, not so much for live app UI/network state
- Explicit hooks/bridges: precise when you bother adding them, but requires code changes, which sucks
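
For a rough sense of the vision latency, here is a back-of-the-envelope sketch using the 2–10s per action figure from above. The 30-action flow is an assumed number for illustration, not a measurement, and `flow_latency_seconds` is a made-up helper.

```python
def flow_latency_seconds(actions: int,
                         per_action_low: float = 2.0,
                         per_action_high: float = 10.0) -> tuple:
    """Return the (best, worst) total latency in seconds for a flow.

    The defaults are the 2-10s per action range quoted above; the
    number of actions is whatever the test flow needs.
    """
    return actions * per_action_low, actions * per_action_high


# An assumed 30-action end-to-end flow:
low, high = flow_latency_seconds(30)
print(f"{low:.0f}s to {high:.0f}s per run")
```

Under that assumption a single run takes somewhere between one and five minutes, before counting retries or token cost.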

Couldn't find anything that gave me low-latency structured JSON state (UI elements, network, errors, logs) across platforms, local-first, without the usual trade-offs. So yeah, I got fed up and built a small local MCP server to solve it for myself.
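
As a sketch of what "structured JSON state" could look like, here is a hypothetical snapshot shape in Python. This is not Autonomo's actual schema. `StateSnapshot`, `NetworkEvent`, the field names, and the `action_id` tag are all assumptions, chosen to show how tagging requests with the action that triggered them separates real failures from background pings.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class NetworkEvent:
    method: str
    url: str
    status: int
    # Hypothetical tag: which agent action triggered this request.
    # None means background traffic (pings, polling, analytics).
    action_id: Optional[str] = None


@dataclass
class StateSnapshot:
    platform: str
    ui_elements: List[dict] = field(default_factory=list)
    network: List[NetworkEvent] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)
    logs: List[str] = field(default_factory=list)


def failed_requests(snapshot: StateSnapshot, action_id: str) -> List[NetworkEvent]:
    """Return the failed (status >= 400) requests caused by one action."""
    return [
        event for event in snapshot.network
        if event.action_id == action_id and event.status >= 400
    ]


snapshot = StateSnapshot(
    platform="web",
    ui_elements=[{"id": "save-button", "enabled": True}],
    network=[
        NetworkEvent("GET", "/health", 200),                  # background ping
        NetworkEvent("POST", "/api/items", 401, "tap-save"),  # caused by the tap
    ],
    errors=["Unhandled promise rejection"],
)

# Plain JSON text that an agent can read without a screenshot.
payload = json.dumps(asdict(snapshot))
```

Calling `failed_requests(snapshot, "tap-save")` returns only the 401 on `/api/items`, and the `/health` ping is ignored. The point is that a few hundred bytes of JSON answer the question a vision model would need a screenshot and several seconds to guess at.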

Full disclosure: it's called Autonomo MCP https://github.com/sebringj/autonomo — very early, just launched.

I don't usually do this "I built a thing" thing — my open-source contributions are mostly small fixes and PRs — but I honestly couldn't see a better way in the current landscape.

It is my hope that Anthropic (or someone) will eventually ship a clean native solution for this. They already shipped BM25-based tool search to shrink tool-calling context like crazy; I'd love to see them (or the industry) make runtime validation "just work" out of the box too.

Sometimes when you code in a vacuum you think your shit smells good. lmk if I'm off base here, I grew up with a mean grandpa so I'm cool with it.

Comments

GahLak•1h ago
You've nailed the real friction point that demos gloss over: agents are great at generation but terrible at verification in production systems. The vision latency tax is brutal once you hit real workflows.
sebringj•1h ago
ya, for real, my boss was like let's do e2e testing with AI, look for solutions out there... then like 2 days later he's like wtf is this bill, and i was like you wanted that right? Was using vision calls in azure foundry and was like over 100 bucks or something just in 2 days of me setting it up and trying it out with all the test cases it had.