CVE-Bench: testing LLM agents on real-world vulnerability patches

https://giovannigatti.github.io/cve-bench/
4•logickkk1•1h ago

Comments

david_shaw•48m ago
The problem with Mythos and Glasswing related hype is that finding vulnerabilities isn't the problem for most organizations. It's great that Mythos and similar models can find vulnerabilities that remained undetected (and hopefully unexploited) for years. That's valuable, especially in open source projects, but it's never been the real challenge for software companies.

The real problem is balancing the need to fix vulnerabilities with the mandate of shipping new products and features. At every organization I've worked for or with, this has been the natural friction point. That's good: Product should make customers happy, and Security should keep the customers and their data safe.

Ultimately, the whole business should share these goals: everyone should strive for a resilient, useful product shipped quickly that delights customers. Easier said than done, but the friction should be tactical ("how do we spend engineering resources?") rather than strategic ("are security fixes important? do we care?").

Which is why I'm much more interested in automated (or semi-automated) PRs to actually fix discovered vulnerabilities rather than just identify them. But, as this project implies, it's not always that simple. It's easy to fix vulnerabilities if you don't care about breaking other functionality.

In my opinion, it's still necessary to have a human developer in the loop to make sure product functionality is maintained, and potentially someone from security in the loop to make sure the vulnerability is actually fixed and not just obfuscated.

Once this technology is sufficiently advanced -- and I think we're getting close -- my hope is that developer and security time will be spent thinking about resilient software design and architecture, not code-level vulnerabilities.

We'll see where it goes.

SQLite is all you need for durable workflows

https://obeli.sk/blog/sqlite-is-all-you-need-for-durable-workflows/
171•tomasol•2h ago•83 comments

The California State Assembly Has Passed the 'Protect Our Games Act'

https://www.invenglobal.com/articles/22330/stop-killing-games-movement-gains-momentum-california-...
37•TechTechTech•35m ago•15 comments

The dead economy theory

https://www.owenmcgrann.com/p/the-dead-economy-theory
384•WillDaSilva•4h ago•527 comments

Notes from the Mistral AI Now Summit in Paris

https://koenvangilst.nl/lab/mistral-ai-now-summit
244•vnglst•4h ago•59 comments

On Rendering Diffs

https://pierre.computer/writing/on-rendering-diffs
57•amadeus•1h ago•14 comments

Bijou64: A variable-length integer encoding

https://www.inkandswitch.com/tangents/bijou64/
178•justinweiss•5h ago•66 comments

It's hard to justify buying a Framework 12

https://www.jeffgeerling.com/blog/2026/its-hard-to-justify-framework-12/
144•watermelon0•5h ago•247 comments

Shift will clean homes for free to train future robots

https://www.theverge.com/ai-artificial-intelligence/939765/ai-training-data-startup-shift-free-cl...
14•evilsimon•1h ago•22 comments

Liquid AI reveals 8B-A1B MoE trained on 38T

https://www.liquid.ai/blog/lfm2-5-8b-a1b
75•simjnd•4h ago•18 comments

Rothko for your current weather conditions

https://rothko.joonas.wtf/
64•jxmorris12•1h ago•7 comments

GTA 6 Developers Unionize

https://rockstarintel.com/gta-6-developers-announce-rockstar-games-union/
447•AndrewKemendo•4h ago•278 comments

Show HN: TV Explorer. Adding advanced UI to free online TV

https://tvexplorer.live
58•dtagames•3h ago•9 comments

Letter from the Duke of Wellington to the British Foreign Office (1809)

https://wellsoc.org/society-member-pages/anecdotes-of-wellington/
27•backuprestore•2h ago•3 comments

Is AI causing a repeat of frontend’s lost decade?

https://mastrojs.github.io/blog/2026-05-23-is-AI-causing-a-repeat-of-frontends-lost-decade/
217•xyzal•9h ago•197 comments

CAPTCHAs can still detect AI agents

https://research.roundtable.ai/captchas-detect-ai/
53•timshell•4h ago•34 comments

We should be more tired than the model

https://vickiboykis.com/2026/05/28/we-should-be-more-tired-than-the-model/
127•tosh•8h ago•105 comments

Robinhood now lets your AI agents trade stocks

https://techcrunch.com/2026/05/27/robinhood-now-lets-your-ai-agents-trade-stocks/
61•wapasta•2h ago•109 comments

High Density Living, 2000 Years Ago: Inside the Roman Apartment Building

https://commonedge.org/high-density-living-2000-years-ago-inside-the-roman-apartment-building/
132•surprisetalk•8h ago•49 comments

Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA

https://github.com/jmaczan/tiny-vllm
7•yu3zhou4•53m ago•0 comments

I am retiring from tech to live offline

https://openpath.quest/2026/i-am-retiring-from-tech-to-live-offline/
634•PinkG•5h ago•439 comments

Local Git remotes

https://cblgh.org/posts/local-git-remotes/
73•surprisetalk•7h ago•59 comments

Cedana (YC S23) Is Hiring

https://www.ycombinator.com/companies/cedana/jobs/d1vYocG-forward-deployed-engineer-ai-hpc
1•neelm•8h ago

Someone used my open source project to phish people

https://andrej.sh/posts/phishing-through-my-open-source-project
72•andrejsshell•7h ago•42 comments

Expertise in the age of AI

https://www.moderndescartes.com/essays/ai_and_expertise/
84•brilee•6h ago•83 comments

Real-time LLM Inference on Standard GPUs: 3k tokens/s per request

https://blog.kog.ai/real-time-llm-inference-on-standard-gpus-3-000-tokens-s-per-request/
186•NicoConstant•10h ago•84 comments

ATLAS: Autoformalized Textbook Library At Scale

https://github.com/facebookresearch/atlas-lean
24•vrm•1d ago•4 comments

AI will be used to estimate age of asylum seekers from next year

https://www.bbc.co.uk/news/articles/ce3pe36qe7ro
30•vylorn•2h ago•23 comments

Durable execution, the hard way

https://github.com/hatchet-dev/durable-execution-the-hard-way
45•abelanger•1d ago•3 comments

Microsoft 0-day feud escalates as researcher threatens another exploit dump

https://www.theregister.com/security/2026/05/28/microsoft-0-day-feud-escalates-as-researcher-thre...
17•Cider9986•54m ago•3 comments