Show HN: Lowfat – pluggable CLI filter that saved 91.8% of my LLM tokens

https://github.com/zdk/lowfat
48•zdkaster•6h ago•36 comments

Show HN: Local-first fast CPU image to text for screenshots, PDFs, webpages

https://github.com/kouhxp/textsnap
10•mrkn1•4h ago•12 comments

Show HN: Fast Android File Manager that works

https://github.com/djanogly/fast-android-file-manager
2•jackjayd•3h ago•3 comments

Show HN: Mercek – A Desktop IDE for AWS ECS

https://www.mercek.dev/
60•utibeumanah•18h ago•26 comments

Show HN: Prela – Purely Algebraic Relation Combinators

https://github.com/remysucre/prela
70•remywang•4d ago•13 comments

Show HN: Altersend – File sharing without cloud

https://github.com/denislupookov/altersend
9•denisdev1•9h ago•4 comments

Show HN: Uruky (EU-based Kagi alternative) now has Image Search and URL Rewrites

https://uruky.com/?il=en
228•BrunoBernardino•1d ago•213 comments

Show HN: Edsger – A handwritten Clojure REPL for the reMarkable 2

https://handwritten.danieljanus.pl/2026-06-01-edsger.html
257•nathell•2d ago•34 comments

Show HN: I reverse-engineered the world maps of Test Drive III (1990 DOS game)

https://github.com/s-macke/Test-Drive-3-Maps
215•s-macke•5d ago•56 comments

Show HN: Hitoku Draft – Context aware local assistant

https://hitoku.me/draft/
19•lostathome•22h ago•5 comments

Show HN: NoiR Code – because QR sounds similar to "noir"

https://noir-code.suncake.xyz/
11•Sunkek•2d ago•5 comments

Show HN: Cost.dev (YC W21) – making agents cost-aware and cheaper to call

https://cost.dev/
33•akh•1d ago•18 comments

Show HN: Formally verified polygon intersection – Opus 4.8 oneshots, prev failed

https://github.com/schildep/verified-polygon-intersection
45•permute•17h ago•13 comments

Show HN: Papernews – self-hosted daily newspaper PDF for your reMarkable

https://github.com/marcj/papernews
10•bourbonproof•16h ago•3 comments

Show HN: I embedded 685M public texts in 32 minutes (on 8x A100, Rust, TensorRT)

https://github.com/Artain-AI/ignite-ms
6•ddayanov•1d ago•0 comments

Show HN: Eyeball

https://eyeball.rory.codes/
291•mrroryflint•3d ago•88 comments

Show HN: Nutrepedia – Nutrition info in 29 locales built with Clojure and Htmx

https://nutrepedia.com/en-us/
132•llovan•1d ago•29 comments

Show HN: Boxes.dev: ditch localhost; run Claude Code and Codex in the cloud

https://boxes.dev
92•nab•1d ago•68 comments

Show HN: CentProof – Local-first bank statement reconciliation for macOS

https://centproof.com
3•javamantraact•10h ago•0 comments

Show HN: Mnemo – local-first AI memory layer for any LLM (Rust, SQLite,petgraph)

https://github.com/zaydmulani09/mnemo
59•zaydmulani•1d ago•26 comments

Show HN: Lessons learned from running Claude Code swarms at scale

8•sermakarevich•11h ago•2 comments

Show HN: Using Haskell to play music on 3D printer motors (2020)

https://lucasoshiro.github.io/software-en/2020-07-31-music_gcode/
11•lucasoshiro•19h ago•2 comments

Show HN: Intencion – Product analytics that improves your AI agents continuously

https://intencion.io
5•sakuraiben•18h ago•0 comments

Show HN: Live breath detection and biofeedback from a phone microphone

https://github.com/shiihaa-app/shiihaa-breath-detection
64•felixzeller•2d ago•26 comments

Show HN: FFmpeg WebCLI – Full FFmpeg in Browser, Offline PWA, No Uploads(WASM)

https://github.com/tejaswigowda/ffmpeg-webCLI
81•tejaswigowda•19h ago•25 comments

Show HN: Rscrypto, pure-Rust crypto with industry leading public benches

https://github.com/loadingalias/rscrypto
33•LoadingALIAS•1d ago•14 comments

Show HN: Digger Solo – Local AI File Explorer

https://solo.digger.lol
5•sean_pedersen•20h ago•0 comments

Show HN: Ideogram 4.0 – open-weight 9.3B text-to-image model

https://github.com/ideogram-oss/ideogram4
46•pigcat•1d ago•10 comments

Show HN: Bio Glyph – Turn Your Face into a One-Line Drawing

https://bio.bairui.dev/
21•subairui•1d ago•17 comments

Show HN: ControllerTest-test gamepads,stick drift and polling rate by browser

https://controllertestonline.com/
4•zylics•15h ago•0 comments

Show HN: I benchmarked LLM agents on fixing real-world security vulnerabilities

https://giovannigatti.github.io/cve-bench/
4•ggattip•8h ago
I built a benchmark with 20 real CVEs across 18 Python projects (Pillow, GitPython, yt-dlp, urllib3, etc.). I ran it with 5 LLM agents (3 OpenAI, 2 poolside) and 3 different prompts (full advisory, locate, diagnose), for a total of 300 runs. The agents are tasked with fixing security vulnerabilities in a sandboxed environment, and they are scored against hidden security tests taken from the maintainer's own fix.
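
For readers checking the arithmetic, here is a minimal sketch of how the run count follows from the setup described above. The agent and CVE labels are hypothetical placeholders (the post only gives the counts), and this is not taken from the cve-bench code.

```python
from itertools import product

# Hypothetical labels: the post states only the counts
# (5 agents, 3 prompts, 20 CVEs), not these identifiers.
AGENTS = [f"agent-{i}" for i in range(1, 6)]
PROMPTS = ["full advisory", "locate", "diagnose"]
CVES = [f"cve-{i:02d}" for i in range(1, 21)]


def run_grid():
    """Enumerate every (agent, prompt, cve) combination, one run each."""
    return list(product(AGENTS, PROMPTS, CVES))


runs = run_grid()
print(len(runs))  # 5 * 3 * 20 = 300
```

Each tuple is one sandboxed attempt, so a full grid with one run per cell gives the 300 runs mentioned in the post.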

The best solve rate was 50%. In the other 50%, some fixes are coherent and pass all regression tests, but the vulnerability is still present.
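
The distinction above (a patch that looks fine but leaves the hole open) can be sketched as a small outcome classifier. This is an illustration only: the `RunResult` fields and the outcome labels are assumptions, not the benchmark's actual schema, and it assumes a run counts as solved only when both test suites pass.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Hypothetical record of one agent run (field names are assumptions)."""
    regression_passed: bool  # existing test suite still passes
    security_passed: bool    # hidden security tests from the maintainer's fix pass


def classify(result: RunResult) -> str:
    """Map a run to an outcome label (labels are illustrative)."""
    if result.regression_passed and result.security_passed:
        return "solved"
    if result.regression_passed:
        # Coherent-looking patch, but the vulnerability is still present.
        return "passes regression, still vulnerable"
    return "regression failure"


print(classify(RunResult(regression_passed=True, security_passed=False)))
```

Separating the second case from a plain failure is what makes the "plausible but wrong" fixes visible in the results.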

The main differentiator I found between models is cost: gpt-5.5 is 12× more expensive than gpt-5.4-mini while producing statistically similar results. Within-family performance gaps are small, which suggests the difference is likely due to model training data. I also did a power analysis: the task count needed to detect a meaningful within-family edge is ~700.
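
As a rough illustration of what such a power analysis involves, here is a sketch using the standard normal-approximation sample size formula for comparing two independent solve rates. The solve rates, significance level, and power below are made-up inputs, and the author's actual method (which may be paired or use different assumptions) is not described in the post, so this is not expected to reproduce the ~700 figure.

```python
from math import ceil, sqrt
from statistics import NormalDist


def tasks_needed(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Tasks per model needed to detect solve rates p1 vs p2.

    Two-sided z-test for two independent proportions, normal approximation.
    """
    if p1 == p2:
        raise ValueError("solve rates must differ to have a detectable effect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical effect sizes, chosen only to show how the count scales.
print(tasks_needed(0.50, 0.40))
print(tasks_needed(0.50, 0.45))
```

The required task count grows roughly with the inverse square of the gap between solve rates, which is why a 20-task benchmark cannot separate models that differ by only a few points.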

Full write-up: https://giovannigatti.github.io/cve-bench

Code: https://github.com/GiovanniGatti/cve-bench

Comments

KyleTheDev•1h ago
"The goal isn’t to rank models, but to understand how they fail."

The goal isn't to write an informative blog post describing what you learned, but to generate slop and expect other folks to read it.

I really wish people would stop doing this. I love reading about your side projects and all of the cool things you're doing. But it just feels insulting to open up something that's so obviously completely AI generated. If you aren't willing to write it in your own voice, why would it be worth reading?

sdsdffsddfs•44m ago
You know the meme where a concise sentence is translated by an LLM into a loquacious formal email which is then again summarized to a concise statement by another LLM on the receiving end?

I believe that's what we need to do here. People have some interesting information to share, but they don't care about penmanship and that's not just being lazy. It takes a lot of time to produce a nice post. I cannot guarantee the author used an LLM but there sure is a suspicious amount of em-dashes.

Anyway, there are still some interesting data points, so I'd recommend running the website through an LLM to get a nice summary if the prominent TL;DR is too short for you. Times are a-changing.

KyleTheDev•32m ago
I agree somewhat. My issue is primarily that, without the author actually penning the post themselves, we have little to no evidence that they've actually done anything. Maybe the data is all AI generated or hallucinated, maybe the validations weren't thorough. I could determine all of these things myself, via rigorous review of the blog post. But at that point, I'm just doing the research myself. Of what use is the post?

For work communications, I agree with you. There's an inherent accountability there. If you send me AI slop, and something goes terribly wrong, you'll be held accountable for the slop. Here, the slop is just noise that prevents us from finding the truly interesting posts.