Why AI code fails differently: What I learned talking to 200 engineering teams

6•pomarie•1h ago
Hey HN, I'm Paul, co-founder of cubic (YC X25). Over the past few months, I've talked to 200+ engineering teams about how they're using AI to ship code.

I kept hearing the same pattern: some teams are shipping 10-15 AI PRs daily without issues. Others tried once, broke production, and gave up entirely.

The difference wasn't what I expected: it wasn't about model choice or prompt engineering.

---

One team shipped an AI-generated PR that took down their checkout flow.

Their tests and CI passed, but the AI had "optimized" their payment processing by changing `queueAnalyticsEvent()` to `analytics.track()`, turning a queued event into a direct call in the payment path. The analytics service has a 2-second timeout, so when it's slow, payment processing times out.

In prod, under real load, 95th percentile latency went from 200ms to 8 seconds. They ended up with 3h of downtime and $50k in lost revenue.

Everyone on that team knew you queue analytics events asynchronously, but that wasn't documented anywhere. It's just something they learned when analytics had an outage years ago.
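To make the failure mode concrete, here is a minimal Python sketch of the two call patterns. It is not the team's code: `FakeAnalytics`, `checkout_sync`, `checkout_queued`, and the 200 ms charge time are hypothetical stand-ins, and latency is simulated as plain integers rather than measured. Only the 2-second timeout comes from the post.

```python
from collections import deque

ANALYTICS_TIMEOUT_MS = 2000  # the 2-second timeout mentioned in the post


class FakeAnalytics:
    """Hypothetical stand-in for the analytics service (latency is simulated)."""

    def __init__(self, latency_ms):
        self.latency_ms = latency_ms
        self.received = []

    def track(self, event):
        """Deliver an event; return how long the caller was blocked, in ms."""
        if self.latency_ms <= ANALYTICS_TIMEOUT_MS:
            self.received.append(event)
            return self.latency_ms
        # Slow service: the caller waits for the full timeout, then gives up.
        return ANALYTICS_TIMEOUT_MS


pending = deque()  # in-process queue, drained outside the request path


def queue_analytics_event(event):
    """Enqueue and return immediately; the caller is blocked for ~0 ms."""
    pending.append(event)
    return 0


def checkout_sync(analytics, charge_ms=200):
    # The "optimized" version: analytics latency lands on the payment path.
    return charge_ms + analytics.track("checkout")


def checkout_queued(charge_ms=200):
    # The original version: analytics latency never touches the payment path.
    return charge_ms + queue_analytics_event("checkout")


def drain(analytics):
    """Background worker step: deliver whatever has been queued."""
    while pending:
        analytics.track(pending.popleft())
```

With a slow analytics service, `checkout_sync` pays the full timeout on every request while `checkout_queued` stays at the charge time. A unit test that mocks analytics with zero latency sees no difference between the two, which is consistent with tests and CI passing.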

*The pattern*

Traditional CI/CD catches syntax errors, type mismatches, test failures.

But AIs don't make these mistakes (or at least, tests and linters catch them before they get committed). The problem with AI is that it generates syntactically perfect code that violates your system's unwritten rules.

*The institutional knowledge problem*

Every codebase has landmines that live in engineers' heads, accumulated through incidents.

AIs can't know these, so they fall into the traps. It's then on the code reviewer to spot them.

*What the successful teams do differently*

They write constraints in plain English. Then AI enforces them semantically on every PR. E.g., "All routes in /billing/* must pass requireAuth and include an orgId claim".

AI reads your code, understands the call graph, and blocks merges that violate the rules.
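As a rough illustration of the plumbing around such a check, the sketch below picks the plain-English rules that apply to a PR's changed files and assembles a review prompt. Everything here is an assumption for illustration: the `Rule` shape, the glob-based `scope` field, the prompt wording, and the second rule (written from the analytics story above). The model call itself is left out, since the post doesn't say how any particular tool makes it.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Rule:
    scope: str  # hypothetical: glob matched against changed file paths
    text: str   # the constraint, in plain English


RULES = [
    Rule("billing/*",
         "All routes in /billing/* must pass requireAuth and include an orgId claim"),
    # Illustrative rule derived from the checkout incident described above.
    Rule("*",
         "Analytics events must be queued asynchronously, never sent "
         "directly from a request path"),
]


def rules_for(changed_paths):
    """Return the rules whose scope matches at least one changed file."""
    return [r for r in RULES
            if any(fnmatch(path, r.scope) for path in changed_paths)]


def build_review_prompt(diff, changed_paths):
    """Assemble the text a reviewing model would be given (model call omitted)."""
    rules = rules_for(changed_paths)
    lines = [
        "Review this diff against the team rules below.",
        "Answer 'VIOLATION <rule number>: <reason>' or 'OK'.",
        "",
    ]
    lines += [f"{i}. {rule.text}" for i, rule in enumerate(rules, 1)]
    lines += ["", "Diff:", diff]
    return "\n".join(lines)
```

Scoping rules by path keeps the prompt short for PRs that don't touch a sensitive area; whether real tools do this is not something the post states.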

*The bottleneck*

When you're shipping 10x more code, validation becomes the constraint, not generation speed.

The teams shipping AI at scale aren't waiting for better models. They're using AI to validate AI-generated code against their institutional knowledge.

The gap between "AI that generates code" and "AI you can trust in production" isn't about model capabilities; it's about bridging the institutional knowledge gap.

Comments

pomarie•1h ago
We're building something at cubic that helps with this. You write your constraints in plain English, and AI enforces them semantically on every PR.

If you're curious, you can check it out here: https://cubic.dev

Happy to answer any questions about what we've seen working (or not working) across different teams.

GreenGames•1h ago
Super interesting take Paul. Curious btw, how are these teams actually encoding their “institutional knowledge” into constraints? Like is it some manual config or more like natural‑language rules that evolve with the codebase?
pomarie•1h ago
Good q! So it depends.

Some teams are using Claude or similar models in GitHub Actions, which automatically review PRs. The rules are basically natural language encoded in a YAML file that's committed in the codebase. Pretty lightweight to get started.
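For a sense of what such a file might look like, here is a purely hypothetical sketch. The file name, keys, and layout are assumptions, not a schema from cubic or from any GitHub Action; only the billing rule comes from the post above, and the analytics rule is written from the checkout story.

```yaml
# .review-rules.yml (hypothetical name and schema, for illustration only)
rules:
  - scope: "billing/**"
    rule: "All routes in /billing/* must pass requireAuth and include an orgId claim"
  - scope: "**"
    rule: "Analytics events must be queued asynchronously, never sent directly from a request path"
```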

Other teams upgrade to dedicated tools like cubic. You can encode your rules in our UI, and we're releasing a feature that lets you write them directly in your codebase. We'll check them on every PR and leave comments when something violates a constraint.

The in-codebase approach is nice because the rules live next to the code they're protecting, so they evolve naturally as your system changes.
