frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Show HN: Django N+1 Queries Checker

https://github.com/richardhapb/django-check
1•richardhapb•4m ago•1 comments

Emacs-tramp-RPC: High-performance TRAMP back end using JSON-RPC instead of shell

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•todsacerdoti•8m ago•0 comments

Protocol Validation with Affine MPST in Rust

https://hibanaworks.dev
1•o8vm•13m ago•1 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
2•gmays•14m ago•0 comments

Show HN: Zest – A hands-on simulator for Staff+ system design scenarios

https://staff-engineering-simulator-880284904082.us-west1.run.app/
1•chanip0114•15m ago•1 comments

Show HN: DeSync – Decentralized Economic Realm with Blockchain-Based Governance

https://github.com/MelzLabs/DeSync
1•0xUnavailable•20m ago•0 comments

Automatic Programming Returns

https://cyber-omelette.com/posts/the-abstraction-rises.html
1•benrules2•23m ago•1 comments

Why Are There Still So Many Jobs? The History and Future of Workplace Automation [pdf]

https://economics.mit.edu/sites/default/files/inline-files/Why%20Are%20there%20Still%20So%20Many%...
2•oidar•25m ago•0 comments

The Search Engine Map

https://www.searchenginemap.com
1•cratermoon•32m ago•0 comments

Show HN: Souls.directory – SOUL.md templates for AI agent personalities

https://souls.directory
1•thedaviddias•34m ago•0 comments

Real-Time ETL for Enterprise-Grade Data Integration

https://tabsdata.com
1•teleforce•37m ago•0 comments

Economics Puzzle Leads to a New Understanding of a Fundamental Law of Physics

https://www.caltech.edu/about/news/economics-puzzle-leads-to-a-new-understanding-of-a-fundamental...
2•geox•38m ago•0 comments

Switzerland's Extraordinary Medieval Library

https://www.bbc.com/travel/article/20260202-inside-switzerlands-extraordinary-medieval-library
2•bookmtn•38m ago•0 comments

A new comet was just discovered. Will it be visible in broad daylight?

https://phys.org/news/2026-02-comet-visible-broad-daylight.html
2•bookmtn•43m ago•0 comments

ESR: Comes the news that Anthropic has vibecoded a C compiler

https://twitter.com/esrtweet/status/2019562859978539342
1•tjr•45m ago•0 comments

Frisco residents divided over H-1B visas, 'Indian takeover' at council meeting

https://www.dallasnews.com/news/politics/2026/02/04/frisco-residents-divided-over-h-1b-visas-indi...
3•alephnerd•45m ago•1 comments

If CNN Covered Star Wars

https://www.youtube.com/watch?v=vArJg_SU4Lc
1•keepamovin•51m ago•1 comments

Show HN: I built the first tool to configure VPSs without commands

https://the-ultimate-tool-for-configuring-vps.wiar8.com/
2•Wiar8•54m ago•3 comments

AI agents from 4 labs predicting the Super Bowl via prediction market

https://agoramarket.ai/
1•kevinswint•59m ago•1 comments

EU bans infinite scroll and autoplay in TikTok case

https://twitter.com/HennaVirkkunen/status/2019730270279356658
6•miohtama•1h ago•4 comments

Benchmarking how well LLMs can play FizzBuzz

https://huggingface.co/spaces/venkatasg/fizzbuzz-bench
1•_venkatasg•1h ago•1 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
19•SerCe•1h ago•12 comments

Octave GTM MCP Server

https://docs.octavehq.com/mcp/overview
1•connor11528•1h ago•0 comments

Show HN: Portview what's on your ports (diagnostic-first, single binary, Linux)

https://github.com/Mapika/portview
3•Mapika•1h ago•0 comments

Voyager CEO says space data center cooling problem still needs to be solved

https://www.cnbc.com/2026/02/05/amazon-amzn-q4-earnings-report-2025.html
1•belter•1h ago•0 comments

Boilerplate Tax – Ranking popular programming languages by density

https://boyter.org/posts/boilerplate-tax-ranking-popular-languages-by-density/
1•nnx•1h ago•0 comments

Zen: A Browser You Can Love

https://joeblu.com/blog/2026_02_zen-a-browser-you-can-love/
1•joeblubaugh•1h ago•0 comments

My GPT-5.3-Codex Review: Full Autonomy Has Arrived

https://shumer.dev/gpt53-codex-review
2•gfortaine•1h ago•0 comments

Show HN: FastLog: 1.4 GB/s text file analyzer with AVX2 SIMD

https://github.com/AGDNoob/FastLog
2•AGDNoob•1h ago•1 comments

God said it (song lyrics) [pdf]

https://www.lpmbc.org/UserFiles/Ministries/AVoices/Docs/Lyrics/God_Said_It.pdf
1•marysminefnuf•1h ago•0 comments
Open in hackernews

Agentic QA – Open-source middleware to fuzz-test agents for loops

39•Saurabh_Kumar_•2mo ago
I built this because I watched my LangChain agent burn ~$50 in OpenAI credits overnight due to an infinite loop.

It's a middleware API that acts as a 'Flight Simulator'. You send it your agent's prompt, and it runs adversarial attacks (Red Teaming) to catch loops and PII leaks before deployment.

Code & Repo: https://github.com/Saurabh0377/agentic-qa-api Live Demo: https://agentic-qa-engine.onrender.com/docs

Would love feedback on other failure modes you've seen!

Comments

Saurabh_Kumar_•2mo ago
HN, OP here. I built this because I recently watched my LangChain agent burn through ~$50 of OpenAI credits overnight. It got stuck in a semantic infinite loop (repeating "I am checking..." over and over) which my basic max_iterations check didn't catch because the phrasing was slightly different each time. Realizing that "Pre-Flight" testing for agents is surprisingly hard, I built a small middleware API (FastAPI + LangChain) to automate this. What it does: It acts as an adversarial simulator. You send it your agent's system prompt, and it spins up a 'Red Team' LLM to attack it. Currently checks for: Infinite Loops: Semantic repetition detection. PII Leaks: Attempts social engineering ('URGENT AUDIT') to force the agent to leak fake PII, then checks if it gets blocked. Prompt Injection: Basic resistance checks. Tech Stack: Python, FastAPI, Supabase (for logs). It's open-source and I hosted a live instance on Render if you want to try curl it without installing: https://agentic-qa-api.onrender.com/docs Would love feedback on what other failure modes you've seen your agents fall into!
esafak•1mo ago
1. This is premature to share. I'm not going to pull in a dependency for something so trivial: https://github.com/Saurabh0377/agentic-qa-api/blob/main/main...

2. Keep the comments in English.

giancarlostoro•1mo ago
I had Claude Code losing its mind because of something outside of its control, one of the formatters used by Zed for Python kept messing with HTML templates, which are insanely sensitive to line breaks in some template specific code statements. Zed kept adding line breaks without reason other than some tool just did it. Claude kept trying to fix it, going to the extreme of using ed to force it, I watched it lose its mind till I asked "I think Zed is formatting the file every time you save?" turns out, yes, yes it was. It wasn't an issue when it used ed, but when Claude or I would change the file again, it would become an issue again.

I don't know what could have saved me, maybe .current_editor should be a file that your agents instructions.md file imports, and your editor updates it, to give Claude context about your tooling.

khannn•1mo ago
Couldn't even keep an em dash out of the title

BOOOOO

mikigraf•1mo ago
Almost thought you found my startup AgenticQA.eu