Ask HN: How did you scale AI development?

1•logicallee•2h ago

I have a medium-sized project that AI is developing with some guidance from me. (This is the only way I can put it: since I don't have expertise in the technologies it's using, it's like I'm managing its development.)

As I develop it, I run into regressions where previously working features break. I'd like to keep iterating on it this way, since I have built perfectly working applications with AI. Do you have any tips for me? How did you successfully scale developing with AI?

Comments

janpio•2h ago

Is the breaking functionality fully covered with tests, and does the agent already run those tests when adding or changing things? If not, that would be a promising way to help the AI avoid breaking things. If so, can that loop be tightened further to support the AI?

logicallee•2h ago

>Is the breaking functionality fully covered with tests,

Did you have success having AI iterate on code fully covered by tests?

I began to add tests; however, I am currently testing manually after each change. This is because I asked ChatGPT for a research study of best practices for AI development, which it produced here [1]. It suggested:

>Notably, some found that Claude’s first attempt often includes excess or "over-engineered" code. A candid blog post mentioned Claude as a "real master at shitting in the code" if not guided properly – it can "generate a ton of unnecessary code… even when you ask for minimalism, it will slap on a pile of code with useless tests that outsmart themselves and don’t work."

and:

>a developer noted they initially tried having Claude maintain extensive docs and tests for everything, but realized this added too many points of failure (the AI would waste effort updating documentation instead of focusing on code). Over-engineering the process can backfire.

For these reasons, I have been testing manually between iterations. (Though I develop using ChatGPT 5 as well as Claude, depending on the task.)

[1] https://chatgpt.com/share/68fbaeea-f528-800b-b090-1bb6b3b2ca...

janpio•1h ago

Getting the agent to run tests can definitely have a very positive impact: it can realize on its own that it broke something unrelated and fix it (or be easily prompted to, if it gives up anyway).

Aside: I often remove some of the tests that seem superfluous to me, or explicitly ask for the minimal set of tests that still covers the functionality in the first place. Some models can definitely go "all in" on tests, like a very eager intern who just learned about testing. For your cases, where you end up with broken functionality after a prompt, just having an integration test that fails when the functionality breaks might be enough.
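
To make the "one integration test per feature that has broken before" idea concrete, here is a rough sketch in Python using only the standard library. Everything in it is made up for illustration: `TodoApp` is a hypothetical stand-in for your real application, and `run_regression_suite` is just a name I picked. The point is only the shape: a couple of end-to-end checks, plus a single entry point the agent can run after every change.

```python
import unittest


# Hypothetical stand-in for the real application (an assumption for this
# sketch); in practice you would import your own project's code instead.
class TodoApp:
    def __init__(self):
        self._items = []

    def add(self, title):
        if not title.strip():
            raise ValueError("title must not be empty")
        self._items.append({"title": title.strip(), "done": False})
        return len(self._items) - 1

    def complete(self, index):
        self._items[index]["done"] = True

    def pending(self):
        return [item["title"] for item in self._items if not item["done"]]


class TodoWorkflowTest(unittest.TestCase):
    """End-to-end paths through a feature that has regressed before."""

    def test_add_complete_and_list(self):
        app = TodoApp()
        first = app.add("write tests")
        app.add("  ship feature ")
        app.complete(first)
        # Only the unfinished item should remain, with whitespace trimmed.
        self.assertEqual(app.pending(), ["ship feature"])

    def test_rejects_empty_title(self):
        with self.assertRaises(ValueError):
            TodoApp().add("   ")


def run_regression_suite():
    """Run the tests above and return True only if every one passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TodoWorkflowTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    # A non-zero exit code gives the agent an unambiguous "you broke it" signal.
    raise SystemExit(0 if run_regression_suite() else 1)
```

You would then tell the agent to run this file after each change and not to call the task done until it exits with 0. Keeping the suite this small is deliberate: it is meant to catch regressions in behavior you care about, not to cover everything.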
