What the Success of Coding Agents Teaches Us about AI Systems in General

https://softwarefordays.com/post/software-is-mostly-all-you-need/
13•jbmilgrom•2h ago

Comments

wrs•1h ago
In other words, a higher-level JIT compiler, meaning it still dynamically generates code based on runtime observations, but the code is in a higher-level language than assembly, and the observations are of a higher-level context than just runtime data types.
verdverm•1h ago
Lost me at the claim that AI is good at making judgements. This is the exact opposite of my experience: they reliably make both good and bad decisions.
mvc•33m ago
I think it makes better decisions than me provided I give it enough high-level direction and context.

Sometimes I give it __too much__ direction and it finds the solution I had in mind, but not the best one.

I'm not into it enough that I'm formally running different personas against each other in a co-operative system but I kind of informally do that.

rybosworld•15m ago
I think that's also true of people but we are kinder to each other and ourselves when judgement is bad.

How many times have you been in a conversation where you asked the wrong question or stated the wrong thing because you either weren't 100% listening (no one is), or you forgot, or you didn't connect the same dots that others did?

Terr_•7m ago
Treating humans differently makes sense because the "badness" of a judgement isn't just the correctness of an outcome, but also the nature of the process that created it, and humans are a different process.
2001zhaozhao•1h ago
> Code is the policy, deployment is the episode, and the bug report is the reward signal

This is a great quote. I think it makes a ton of sense to view a sufficiently-cheap-and-automated agentic SWE system as a machine learning system rather than traditional coding.

* Perhaps the key to transparent/interpretable ML is to just replace the ML model with AI-coded traditional software and decision trees. This way it's still fully autonomously trained but you can easily look at the code to see what is going on.

* I also wonder whether you can use fully-automated agentic SWE/data science in adversarial use-cases where you traditionally have to use ML, such as online moderation. You could set a clear goal to cut down on any undesired content while minimizing false positives, and the agent would be able to create a self-updating implementation that dynamically responds to adversarial changes. I'm most familiar with video game anti-cheat, where I think something like this is very likely possible.

* Perhaps you can use a fully-automated SWE loop, constrained in some way, to develop game enemies and AI opponents, which currently require gruesome amounts of manual work to implement. Those are typically too complex to tackle using traditional ML, and you can't naively use RL because the enemies are supposed to be immersive rather than being the best at playing the game by gaming the mechanics. Maybe with a player controller SDK and enough instructions (and live player feedback?), you can get an agent to make a programmatic game AI for you and automatically refine it to be better.
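
A minimal sketch of what the first two bullets might look like in practice, assuming a toy moderation task. Everything here is hypothetical (the `Rule` type, the rule names, the thresholds, and the sample reports are made up for illustration): the "policy" is plain code you can read, and a batch of labeled reports acts as the reward signal used to compare two versions of it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical example: a moderation "policy" written as plain, readable rules.
# Rule names and thresholds are illustrative assumptions, not from the thread.


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[str], bool]


def make_policy(banned_words: List[str], max_links: int) -> List[Rule]:
    """Build the rule list; an agent would rewrite this code between deployments."""
    return [
        Rule("banned_word", lambda text: any(w in text.lower() for w in banned_words)),
        Rule("too_many_links", lambda text: text.lower().count("http") > max_links),
    ]


def classify(policy: List[Rule], text: str) -> Optional[str]:
    """Return the name of the first rule that fires, or None if the text is allowed."""
    for rule in policy:
        if rule.matches(text):
            return rule.name
    return None


def score(policy: List[Rule], reports: List[Tuple[str, bool]]) -> float:
    """Fraction of (text, should_block) reports the policy gets right: the 'reward'."""
    correct = sum(
        (classify(policy, text) is not None) == should_block
        for text, should_block in reports
    )
    return correct / len(reports)


# Made-up reports standing in for bug reports / user feedback.
reports = [
    ("buy cheap pills now", True),
    ("see http://a http://b http://c", True),
    ("nice write-up, thanks", False),
    ("free pills http://x", True),
]

v1 = make_policy(banned_words=["pills"], max_links=5)
v2 = make_policy(banned_words=["pills"], max_links=2)

print(score(v1, reports))  # 0.75: the link-spam report slips through
print(score(v2, reports))  # 1.0
print(classify(v2, "see http://a http://b http://c"))  # too_many_links
```

Unlike a trained classifier, every blocked item comes back with the name of the rule that blocked it, which is the interpretability advantage being described.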

jbmilgrom•18m ago
> Perhaps the key to transparent/interpretable ML is to just replace the ML model with AI-coded traditional software and decision trees. This way it's still fully autonomously trained but you can easily look at the code to see what is going on.

For certain problems I think that's completely right. Of course, we still aren't going to want that for classic ML domains like vision and, now, coding. But for those domains where a software substrate is appropriate, software has a huge interpretability and operability advantage over ML.

isodev•43m ago
> Neural networks excel at judgment

I don’t think they do. I think they excel at outputting echoes of their training data that best fit (rhyme with, contextually) the prompt they were given. If you try using Claude with an obscure language or use case, you will notice that effect even more - it will keep pulling towards things it knows that aren’t at all what’s asked or “the best judgement” for what’s needed.

rybosworld•7m ago
Neural nets have been better at classifying handwriting (MNIST) than the best humans for a long time. This is what the author means by judgement.

They are super-human in their ability to classify.
