frontpage.

Weird System Prompt Artefacts

https://blog.nilenso.com/blog/2026/02/12/weird-system-prompt-artefacts/
1•sriharis•1m ago•0 comments

Making WebAssembly a first-class language on the Web

https://hacks.mozilla.org/2026/02/making-webassembly-a-first-class-language-on-the-web/
1•mikece•1m ago•0 comments

Disrupting the Gridtide Global Cyber Espionage Campaign

https://cloud.google.com/blog/topics/threat-intelligence/disrupting-gridtide-global-espionage-cam...
1•jonah•1m ago•0 comments

A Decade of Docker Containers

https://anil.recoil.org/papers/2026-decade-docker
1•mariuz•1m ago•0 comments

Will vibe coding end like the maker movement?

https://read.technically.dev/p/vibe-coding-and-the-maker-movement
1•itunpredictable•1m ago•0 comments

Feature Platforms: The Underrated Infrastructure Layer Behind Fast ML Teams

https://blog.nilenso.com/blog/2026/02/19/feature-platforms/
1•sriharis•2m ago•0 comments

From Tahoe bugs to app review delays, the Apple developer experience is fraying

https://keydiscussions.com/2026/02/26/from-tahoe-bugs-to-long-app-review-wait-times-even-app-proc...
1•spenvo•2m ago•0 comments

The Government Just Made It Harder to See What Spy Tech It Buys

https://www.404media.co/the-government-just-made-it-harder-to-see-what-spy-tech-it-buys/
1•cdrnsf•2m ago•0 comments

iRobot Went Bankrupt. Its Product Scores Explain Why

https://www.criticaster.com/blog/irobot-bankrupt-scores-explain-why
1•gghootch•2m ago•0 comments

The Agentic Simul: What 500 PRs in two months taught me

https://tobeva.com/articles/simul/
1•pbw•2m ago•0 comments

Schrödinger Color Theory Completed After 100 Years

https://www.sciencedaily.com/releases/2026/02/260222092302.htm
1•EventH-•3m ago•0 comments

Linux Heterogeneous Memory Management (HMM)

https://www.kernel.org/doc/html/latest/mm/hmm.html
3•teleforce•3m ago•0 comments

CVE-2026-2006 – PostgreSQL Out-of-cycle release

https://wiki.postgresql.org/wiki/2026-02_Regression_Fixes
1•krembo•4m ago•0 comments

I don't need AI to build me a new app. I need it to make Jira bearable

1•niel_hu•4m ago•0 comments

Show HN: Cifer, zero-key custody using threshold cryptography

https://cifer-security.com
1•mikflex•4m ago•0 comments

British Citizenship Applications by US Nationals Hit Record High

https://www.bloomberg.com/news/articles/2026-02-26/british-citizenship-applications-by-us-nationa...
2•helsinkiandrew•4m ago•0 comments

A New Era of Databases: Lakebase

https://www.databricks.com/blog/what-is-a-lakebase
1•mastabadtomm•5m ago•0 comments

Show HN: NotBuiltYet – Open-source library of civilisation problems worth solving

https://shivankar-madaan.github.io/notbuiltyet/
2•mrxlimitless•5m ago•0 comments

Show HN: Ryvos – Autonomous AI assistant in Rust (15MB RAM, 50 tools, 16 providers)

https://ryvos.dev
1•aayush-mishraaa•6m ago•0 comments

Nano Banana 2: Google's latest AI image generation model

https://blog.google/innovation-and-ai/technology/ai/nano-banana-2/
4•davidbarker•6m ago•1 comment

EHR API Explorer

https://explorer.usecobalt.com/
1•bryanmillstein•7m ago•1 comment

Matrix Inverse Roots with Fixed-Budget GEMM Kernels

https://jiha-kim.github.io/posts/fast-matrix-inverse-roots/
1•ibobev•7m ago•0 comments

Partial Truth vs. Explicit Failure: Designing Honest System Responses

https://www.sandordargo.com/blog/2026/02/25/partial-truth-vs-explicit-failure
1•ibobev•8m ago•0 comments

Memory or mood? Probiotic capsules and powders may affect the brain differently

https://medicalxpress.com/news/2026-02-memory-mood-probiotic-capsules-powders.html
1•PaulHoule•9m ago•0 comments

Linux Foundation's report reveals contributing to open source offers a 2x-5x ROI

https://thenewstack.io/roi-open-source-contribution/
1•CrankyBear•9m ago•0 comments

Speculations Concerning the First Ultraintelligent Machine (1964) [pdf]

https://languagelog.ldc.upenn.edu/myl/Good1964.pdf
3•ZeljkoS•9m ago•0 comments

Rule of Three (Computer Programming)

https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)
3•thunderbong•10m ago•0 comments

Introduction to Data-Centric Query Compilation

https://duckul.us/blog/data-centric-query-compilation
1•duckulus•10m ago•1 comment

I started a software research company

https://notes.eatonphil.com/2026-02-25-i-started-a-company.html
1•ibobev•10m ago•0 comments

Show HN: I built a minimal distributed tracer from scratch to understand better

https://github.com/td-02/tracelm
1•taeshdas•11m ago•1 comment

Show HN: Coding agents find the right GPU bottleneck 70% of the time, fix it 30%

https://ayushnangia.github.io/iso-bench-website/
2•ayushnangia16•1h ago
One of the authors. Some things that surprised us while running these experiments:

The tasks are pulled from real merged PRs in vLLM and SGLang, so there's a known-good human solution for each one. Agents get the full codebase, the issue description, and a test harness. Pretty generous setup.
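A minimal sketch of what that kind of harness could look like (the function names and return shape here are my assumptions, not the benchmark's actual evaluation code): a candidate patch only counts as a working optimization if it matches the known-good baseline's outputs and also runs faster.

```python
import time

def evaluate_patch(baseline_fn, patched_fn, inputs, rel_tol=1e-6):
    """Hypothetical harness: a patch counts as a working optimization
    only if it matches the baseline's outputs AND runs faster."""
    # Correctness: compare against the known-good baseline on every input.
    for inp in inputs:
        expected, got = baseline_fn(inp), patched_fn(inp)
        if abs(expected - got) > rel_tol * max(1.0, abs(expected)):
            return {"correct": False, "speedup": None}

    # Performance: wall-clock both versions over the same workload.
    t0 = time.perf_counter()
    for inp in inputs:
        baseline_fn(inp)
    t_base = time.perf_counter() - t0

    t0 = time.perf_counter()
    for inp in inputs:
        patched_fn(inp)
    t_patch = time.perf_counter() - t0

    return {"correct": True, "speedup": t_base / t_patch}
```

For instance, `evaluate_patch(lambda n: sum(range(n)), lambda n: n * (n - 1) // 2, [10_000] * 20)` reports a correct patch with a large speedup, while a patch that returns wrong values is rejected before timing ever runs.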

What we didn't expect: the agents are genuinely good at diagnosing the problem. They read the code, find the bottleneck, describe the right fix. But then the generated code has subtle bugs. Off-by-one in kernel indexing, wrong tensor shapes, missing synchronization barriers. The kind of stuff that passes a code review at first glance but segfaults under load.
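A toy illustration of that failure mode (plain Python, not code from the benchmark): a blocked reduction whose loop bound uses floor division and silently drops the final partial block, so it passes every test whose input size happens to be a multiple of the block size.

```python
def blocked_sum_buggy(x, block=4):
    # Off-by-one-style bug: floor division drops the final partial block,
    # so the result is only correct when len(x) is a multiple of `block`.
    total = 0
    for i in range(len(x) // block):          # should be ceil division
        total += sum(x[i * block:(i + 1) * block])
    return total

def blocked_sum_fixed(x, block=4):
    # Correct version: ceil division covers the tail block too.
    total = 0
    for i in range((len(x) + block - 1) // block):
        total += sum(x[i * block:(i + 1) * block])
    return total

x = list(range(10))               # 10 is not a multiple of 4
print(blocked_sum_buggy(x))       # 28 -- silently drops elements 8 and 9
print(blocked_sum_fixed(x))       # 45 -- matches sum(x)
```

On sizes divisible by the block, the two versions agree, which is exactly why this class of bug can survive a quick review and a green test suite.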

The other weird result: agent rankings completely invert between codebases. Claude Code is the best performer on vLLM (46%) but the worst on SGLang (27%); TRAE with GPT-5 shows the opposite pattern. Same underlying models, different agent scaffolding. It suggests the scaffolding around the model matters at least as much as the model itself.

We also tried three open-source models. None produced a single working optimization. One of them (MiniMax-M2.1) got stuck in a loop printing "I need to actually use the tools now" 2,412 times without ever making a tool call.

The benchmark, all agent transcripts, and evaluation code are open: https://ayushnangia.github.io/iso-bench-website/

Curious what others think; the scaffolding result in particular feels underexplored.

Comments

PaulHoule•1h ago
Those "Lucky Wins" are a big part of the LLM success or "looks like success" story.

One reason the teams I was on did not invent models that good in the 2010s was that we didn't want to give them credit for Lucky Wins.