- n8n: password-reset JWTs being logged at debug level (n8n-io/n8n#29405)
- Vercel AI SDK: role: "system" injection in createAgentUIStream, a runtime schema bypass in ToolLoopAgent, and a prototype-property collision in getMediaTypeFromUrl (vercel/ai#14749, #14750 merged, #14751 merged)
- LangGraph.js: NoSQL injection in MongoDBSaver via unvalidated thread_id / checkpoint_ns / checkpoint_id types (langchain-ai/langgraphjs#2353)
- browser-use: path traversal in remote-fetched templates.json fields (browser-use/browser-use#4777)
- Haystack: SSRF and arbitrary file read via unrestricted OpenAPI $ref resolution, path traversal in the image converter, and unbounded HTTP body reads in LinkContentFetcher (deepset-ai/haystack#11226, #11228, #11229)
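To make the LangGraph.js finding concrete: the classic shape of that bug class is a Mongo filter built from caller-supplied input without a type check. This is an illustrative sketch of the pattern, not LangGraph's actual code; the function names are made up.

```typescript
// Hypothetical sketch of the NoSQL-injection class: if thread_id comes
// from a request body without a type check, an attacker-supplied object
// like { $ne: null } lands in the Mongo filter as a query operator.
type Filter = Record<string, unknown>;

// Vulnerable: trusts the caller's type.
function buildFilterUnsafe(threadId: unknown): Filter {
  return { thread_id: threadId }; // { $ne: null } matches every document
}

// Fixed: reject anything that is not a plain string.
function buildFilterSafe(threadId: unknown): Filter {
  if (typeof threadId !== "string") {
    throw new TypeError("thread_id must be a string");
  }
  return { thread_id: threadId };
}
```

The fix is boring on purpose: validating that the three identifier fields are strings closes the operator-injection channel without touching query logic.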
The false positive rate got low enough that I'd rather have other people running it than keep it private, so it's now public under Apache 2.0. How it works:
- Analyst (1 LLM call): reads the repo and picks 50 to 500 files to deep-scan based on entry points, third-party surface, and dangerous sinks.
- Researcher (per file): walks call chains and writes raw findings.
- QA (per file): re-reads the code against each claim with no access to the researcher's reasoning, and rejects anything that doesn't have a real attack vector.

Keeping the QA agent isolated from the researcher is what got noise under control. If it sees the researcher's reasoning, it just agrees with it.
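The isolation boundary is the load-bearing part: QA gets the claim and the file, never the researcher's chain of reasoning, so it has to re-derive or reject each claim from the code alone. A minimal sketch of that data flow (the type and function names are illustrative, not probus's API; the agent calls themselves are stubbed out):

```typescript
// What the researcher emits per finding.
interface RawFinding {
  claim: string;      // e.g. "SSRF in fetchUrl via unvalidated host"
  reasoning: string;  // researcher's argument -- must NOT reach QA
  file: string;
}

// What the QA agent is allowed to see.
interface QAInput {
  claim: string;
  file: string;       // QA re-reads the code itself; no reasoning field
}

// Strip the reasoning before handing findings to QA, so agreement with
// the researcher is impossible by construction, not by prompt discipline.
function toQAInput(finding: RawFinding): QAInput {
  return { claim: finding.claim, file: finding.file };
}
```

In the real pipeline each side of this boundary is a separate LLM session; the point of the sketch is only that the reasoning field is dropped at the type level rather than filtered by instructions.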
Each agent runs in its own query() session through the Claude Agent SDK, with a filesystem sandbox scoped to the target repo.

Cost is tuned for open models: about $0.50 per file with Qwen 3.6 plus DeepSeek v4 Pro on OpenRouter. OpenAI is around 2.5x that; Anthropic is around 10x.

Install and run:

  npm install -g probus
  probus scan ./my-app

Things I'd like feedback on:
- The QA prompt took the most iteration. Happy to walk through it if anyone is working on similar verifier-agent patterns.
- I want to publish a public benchmark against a vulhub-style corpus. Suggestions on which repos to run it against would be helpful.
- The analyst step is a single LLM call right now; on large monorepos it sometimes misses things. Thinking about a hierarchical version.
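As a back-of-envelope check on the cost numbers above, the per-file prices combine with the analyst's 50-to-500-file selection like this (the multipliers are the approximate ratios quoted in this post, not measured prices; scanCostRange is an illustrative helper, not part of probus):

```typescript
// Approximate per-file cost from the post: open models ~$0.50/file,
// OpenAI ~2.5x that, Anthropic ~10x.
const PER_FILE_USD = { open: 0.5, openai: 0.5 * 2.5, anthropic: 0.5 * 10 };

// Cost range for one scan, driven by how many files the analyst selects.
function scanCostRange(
  provider: keyof typeof PER_FILE_USD,
  minFiles = 50,
  maxFiles = 500,
): [number, number] {
  const perFile = PER_FILE_USD[provider];
  return [perFile * minFiles, perFile * maxFiles];
}

// Open models: $25-$250 per repo; Anthropic at the same depth: $250-$2500.
```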