Constraint Decay: The Fragility of LLM Agents in Back End Code Generation

https://arxiv.org/abs/2605.06445

13•wek•2h ago

Comments

maxbond•20m ago

Reminds me of the recent paper about delegating document editing tasks to LLMs across different disciplines [1]. That paper found that programming was the only discipline in which most LLMs could perform long-horizon tasks without accumulating errors and corrupting the document.

I've only read the abstract of this one so far, but it seems this paper has zoomed in on programming with greater fidelity and shown a similar phenomenon. Not over long task horizons, though; more like "long style horizons" of larger sets of structural constraints.

[1] https://arxiv.org/abs/2604.15597

Discussion: https://news.ycombinator.com/item?id=48073246

jdlshore•17m ago

“Our systematic study exposes a phenomenon of constraint decay in LLM-based coding agents. While current models excel at unconstrained generation, their performance drops when forced to navigate explicit architectural rules. For end-users, this dichotomy implies that agents are reliable for rapid prototyping but remain unreliable for production-grade backend development.”

One major weakness of this study is that they didn’t fully test frontier models for cost reasons, so the specific performance results should be taken with a grain of salt. But the overall conclusion that models degrade when both behavior and architecture must be correct is interesting, and something to keep an eye on.

gkfasdfasdf•14m ago

Odd that they used GPT-5.2 and not GPT-5.2-codex, i.e. the one optimized for coding agent tasks.

Microsoft's 6502 BASIC is now Open Source (2025)

https://opensource.microsoft.com/blog/2025/09/03/microsoft-open-source-historic-6502-basic/

42•GTP•1h ago•9 comments

Mastering Dyalog APL

https://mastering.dyalog.com/README.html

67•tosh•3h ago•9 comments

Childhood Computing

https://susam.net/childhood-computing.html

63•blenderob•2h ago•32 comments

DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost

https://esengine.github.io/DeepSeek-Reasonix/

47•Alifatisk•2h ago•25 comments

I spent 50 hours drawing a line graph

https://www.dougmacdowell.com/50-hours-to-draw-some-lines.html

202•dougdude3339•3d ago•31 comments

Microsoft open-sources "the earliest DOS source code discovered to date"

https://arstechnica.com/gadgets/2026/04/microsoft-open-sources-the-earliest-dos-source-code-disco...

347•DamnInteresting•13h ago•107 comments

I keep bouncing off the Scheme language

https://www.sicpers.info/2026/05/i-keep-bouncing-off-the-scheme-language/

49•ingve•2d ago•14 comments

The seed oil panic is hurting my cardiac patients

https://www.statnews.com/2026/05/22/seed-oils-healthy-fats-tallow-fact-check-cardiac-health/

56•randycupertino•39m ago•19 comments

Greg Brockman: Inside the 72 Hours That Almost Killed OpenAI

https://fs.blog/knowledge-project-podcast/greg-brockman/

125•prakashqwerty•6h ago•92 comments

Scammers are abusing an internal Microsoft account to send spam links

https://techcrunch.com/2026/05/21/scammers-are-abusing-an-internal-microsoft-account-to-send-spam/

202•spike021•14h ago•107 comments

Wake up! 16b

https://hellmood.111mb.de/wake_up_16b_writeup.html

323•MaximilianEmel•14h ago•24 comments

Swap tables, flash-friendly swap, swap_ops, and more

https://lwn.net/SubscriberLink/1072657/394b87abd7cc215e/

38•mkesper•4d ago•0 comments

Why is Vivado 2026.1 dropping Linux support for free tier?

https://adaptivesupport.amd.com/s/question/0D5Pd00001YQLdMKAX/why-is-vivado-20261-dropping-linux-...

248•zdw•10h ago•122 comments

Silk: Open-source cooperative fiber scheduler

https://github.com/ClickHouse/silk

70•animetyan•3d ago•9 comments

The C64 Dead Test Font

https://www.masswerk.at/nowgobang/2026/c64-dead-test-font

87•masswerk•11h ago•16 comments

Perceptual Image Codec: What Matters in Practical Learned Image Compression

https://apple.github.io/ml-pico/

12•ksec•3h ago•1 comment

Predicting the 2026 Bristol Bay and Kodiak Salmon Runs

https://www.salmonfinder.com/2026/05/13/bristol-bay-kodiak-predictions-2026

3•mooreds•2d ago•0 comments

Converting an Integer to a Decimal String in Under Two Nanoseconds

https://onlinelibrary.wiley.com/doi/10.1002/spe.70079

77•mpweiher•4d ago•39 comments

Alexander Grothendieck Revolutionized 20th-Century Mathematics

https://www.quantamagazine.org/how-alexander-grothendieck-revolutionized-20th-century-mathematics...

92•anujbans•11h ago•21 comments

Time to talk about my writerdeck

https://veronicaexplains.net/my-first-writerdeck/

416•hggh•20h ago•243 comments

On The <dl> (2021)

https://benmyers.dev/blog/on-the-dl/

414•ravenical•1d ago•121 comments

Show HN: Git-based front-end interface for Hugo

https://github.com/arashthr/hugo-flow

19•arashThr•3d ago•6 comments

My two-part desk setup (2025)

https://arslan.io/2025/11/18/my-two-part-desk-setup/

317•James72689•3d ago•191 comments

The Art of Money Getting

https://kk.org/cooltools/book-freak-210-the-art-of-money-getting/

338•dxs•1d ago•180 comments

My I3-Emacs Integration

https://khz.ac/software/i3-integration.html

94•nosolace•15h ago•39 comments

Sales and Dungeons: Thermal printer TTRPG utility

https://sales-and-dungeons.app/

115•hyperific•2d ago•36 comments

Key, in sight – A guide, of sorts, to keyboard customization

https://aresluna.org/key-in-sight/

25•anotherevan•4d ago•10 comments

Green card seekers must leave U.S. to apply, Trump administration says

https://www.nytimes.com/2026/05/22/us/politics/green-card-changes-trump.html

979•tlhunter•1d ago•1631 comments

Kindle loyalists scramble as Amazon turns page on old e-readers

https://www.reuters.com/business/retail-consumer/kindle-loyalists-scramble-amazon-turns-page-old-...

204•cf100clunk•4d ago•265 comments