What 23K vulnerabilities reveal about audit report quality in Web3

https://colab.research.google.com/drive/1Wp4yyEmXYjHATak7Bmy2lf6DNgyjlxgI?usp=sharing
2•zaevlad•1h ago

Comments

zaevlad•1h ago
Over the past year I built and analyzed a dataset of 23K+ vulnerabilities extracted from smart contract audit reports published between 2023 and 2025. Sources include private auditors, audit firms, and competitive platforms such as Code4rena and Sherlock.

The dataset was cleaned before analysis: 99% of Informational-severity findings and ~40% of Low-severity findings were removed, as they consistently lacked enough detail to be informative.

The goal was to quantify report quality — not just flag vulnerabilities, but measure how well each one is documented. This became the foundation for a RAG-based audit assistant I've been building, where data quality has an outsized effect on output quality.

Scoring methodology:

Each finding was scored on three primary dimensions — description depth, remediation quality, and presence of a PoC. PoC carried the highest weight, as it is the most reliable signal of a useful report. Solidity snippets and severity level contributed additional points. Raw scores (0–15) were log-normalized to 0–1 to prevent score concentration at the top.
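A minimal sketch of how such a score could be computed, for readers who want something concrete. The point values, the input ranges, and the exact normalization formula (`log1p(raw) / log1p(15)`) are assumptions chosen only to match the description above (PoC weighted highest, raw range 0-15, output in 0-1); they are not taken from the notebook.

```python
import math

# Hypothetical weights: the post only states that PoC carries the highest
# weight and that raw scores span 0-15. The point values are assumptions.
MAX_RAW = 15
POC_POINTS = 5
SOLIDITY_POINTS = 2


def raw_score(description_depth, remediation_quality, has_poc,
              has_solidity, severity_bonus):
    """Sum the components into a raw score capped at MAX_RAW.

    Assumed ranges: description_depth 0-3, remediation_quality 0-3,
    severity_bonus 0-2.
    """
    score = description_depth + remediation_quality + severity_bonus
    if has_poc:
        score += POC_POINTS
    if has_solidity:
        score += SOLIDITY_POINTS
    return min(score, MAX_RAW)


def log_normalize(raw, max_raw=MAX_RAW):
    """Map a raw score in [0, max_raw] to [0, 1] on a log scale (assumed form)."""
    return math.log1p(raw) / math.log1p(max_raw)


# A finding with a short description and nothing else.
one_liner = log_normalize(raw_score(1, 0, False, False, 0))
# A finding with full description, remediation, PoC, and Solidity snippet.
complete = log_normalize(raw_score(3, 3, True, True, 2))
```

Under these assumed weights a complete finding reaches the maximum raw score of 15 and normalizes to 1.0, while an empty finding normalizes to 0.0.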

Key findings:

- Total findings analyzed: 23,625
- Mean score: 0.32 | Median: 0.27
- Distribution is multimodal with three distinct quality tiers (~0.05, ~0.25, ~0.60)
- ~25% of findings score above 0.51; these form the high-quality tier ("golden data fund")
- All three normality tests confirm the distribution is significantly non-Gaussian
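For anyone reproducing these summary numbers on their own data, a rough sketch using only the standard library might look like the following. The function name `summarize` and the toy score list are made up for illustration; only the 0.51 cutoff comes from the figures above.

```python
import statistics


def summarize(scores, golden_cutoff=0.51):
    """Return count, mean, median, and the share of scores above the cutoff."""
    golden = [s for s in scores if s > golden_cutoff]
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "golden_share": len(golden) / len(scores),
    }


# Toy data only, not drawn from the real dataset.
toy_scores = [0.05, 0.05, 0.25, 0.25, 0.27, 0.30, 0.60, 0.62]
stats = summarize(toy_scores)
```

On the real dataset the same function would be called on the list of 23,625 normalized scores.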

Most counterintuitive result: Critical-severity bugs score lower on average (0.33) than High-severity ones (0.53). Critical findings tend to be reported as brief alerts without a PoC: the severity speaks for itself, so the write-up gets less attention. High findings, by contrast, typically include more thorough documentation. This is a problem, because the bugs most likely to cause catastrophic losses are often the least well-documented.
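The per-severity comparison is a plain group-by-and-average. A small sketch, assuming each finding is available as a `(severity, score)` pair; the pair layout and the toy values are illustrative and not from the dataset:

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_severity(findings):
    """Group (severity, score) pairs by severity and average each group."""
    grouped = defaultdict(list)
    for severity, score in findings:
        grouped[severity].append(score)
    return {severity: mean(values) for severity, values in grouped.items()}


# Toy data only; the real averages reported above are 0.33 and 0.53.
toy_findings = [
    ("Critical", 0.20),
    ("Critical", 0.40),
    ("High", 0.50),
    ("High", 0.60),
]
by_severity = mean_score_by_severity(toy_findings)
```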

What this means in practice:

The three-peak distribution reflects real behavioral patterns in how auditors write reports. The first cluster (scores ~0.05) represents minimal one-liner findings with no context. The second (~0.25) covers standard reports with a description but no PoC. The third (~0.60) is the minority that includes everything: a clear description, remediation steps, and working exploit code. Only this last group is genuinely useful for both AI training and human review.
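One simple way to label a finding with one of the three clusters is to pick the nearest peak. This is a simplification for illustration, not how the peaks were identified in the analysis; the tier names are hypothetical, and only the peak positions (~0.05, ~0.25, ~0.60) come from the text above.

```python
# Approximate peak positions from the post; tier names are made up here.
TIER_CENTERS = {
    "minimal": 0.05,   # one-liner findings with no context
    "standard": 0.25,  # description but no PoC
    "complete": 0.60,  # description, remediation, and working exploit code
}


def nearest_tier(score):
    """Return the name of the tier whose center is closest to the score."""
    return min(TIER_CENTERS, key=lambda name: abs(score - TIER_CENTERS[name]))


labels = [nearest_tier(s) for s in (0.07, 0.30, 0.55)]
```

A filter for the RAG corpus could then keep only findings labeled `"complete"`, or use the 0.51 cutoff directly.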

For the broader ecosystem, the takeaway is uncomfortable: the current standard of audit reporting leaves most findings underexplained. A well-documented bug with a PoC can be understood, reproduced, and fixed in hours. A vague one-liner can stay misunderstood for weeks — or get silently ignored in the next audit cycle.

If you want to see the full distribution charts and statistics for yourself, I put together an interactive notebook with all the visualizations:

https://colab.research.google.com/drive/1Wp4yyEmXYjHATak7Bmy...

Open to questions on methodology or dataset composition.
