My tips for using LLM agents to create software

https://efitz-thoughts.blogspot.com/2025/08/my-experience-creating-software-with_22.html
61•efitz•6h ago

Comments

efitz•6h ago
I spent much of the last several months using LLM agents to create software. I've written two blog posts about my experience; this is the second post that includes all the things I've learned along the way to get better results, or at least waste less money.
afeezaziz•4h ago
You should write more about your experience using LLMs. Is this solely using LLMs?
xwowsersx•4h ago
This lines up with my own experience of learning how to succeed with LLMs. What really makes them work isn't so different from what leads to success in any setting: being careful up front, measuring twice and cutting once.
CuriouslyC•3h ago
If I paid for my API usage directly instead of the plan it'd be like a second mortgage.
3abiton•1h ago
To be fair, allocating some tokens for planning (recursively) helps a lot. It requires more hands-on work, but produces much better results. Clarifying the tasks and breaking them down is very helpful too. You just end up spending lots of time on it. On the bright side, Qwen3 30B is quite decent, and best of all "free".
manmal•1h ago
One weird trick is to tell the LLM to ask you questions about anything that's unclear at this point. I tell it, e.g., to ask up to 10 questions. Often I do multiple rounds of these Q&A and I'm always surprised at the quality of the questions (w/ Opus). I get better results that way, simply because it reduces the degrees of freedom in which the agent can go off in a totally wrong direction.
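A minimal sketch of that kind of prompt, as a shell function that wraps a task description. The function name `clarify_prompt`, the exact wording, and the sample task are all made up for illustration; only the "ask up to N questions" idea comes from the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a task followed by an "ask me questions first" instruction.
# $1 = task description, $2 = maximum number of questions (defaults to 10).
clarify_prompt() {
    local task="$1"
    local max_questions="${2:-10}"
    printf '%s\n\n' "$task"
    printf 'Before writing any code, ask me up to %s questions about anything that is unclear.\n' "$max_questions"
    printf 'Wait for my answers before you continue.\n'
}

# Illustrative call; paste the output into whatever agent you use.
clarify_prompt "Add a --json flag to the export command." 5
```

Running several rounds just means calling it again with the task plus the answers from the previous round.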
deadbabe•44m ago
This is a little anthropomorphic. The faster option is to tell it to give you the full content of an ideal context for what you’re doing and adjust or expand as necessary. Less back and forth.
manmal•35m ago
Can you give me the full content of the ideal context of what you mean here?
rzzzt•15m ago
Certainly!
bbarnett•18m ago
Oh great.

LLM -> I've read 1000x stack overflow posts on this. The way coding works, is I produce sub-standard code, and then show it to others on stackoverflow! Others chime in with fixes!

You -> Get the LLM to simulate this process, by asking it to post its broken code, then asking for "help" on "stackoverflow" (e.g., the questions it asks), and then pasting the fix responses.

Hands down, you've discovered why LLM code is so junky all the time. Every time it's seen code on SO and other places, it's been "Here's my broken code" and then Q&A followed by final code. Statistically, symbolically, that's how (from an LLM perspective) coding tends to work.

Because of course many code examples it's seen are derived from this process.

So just go through the simulated exchange, and success.

And the best part is, you get to go through this process every time, to get the final fixed code.

rvz•1h ago
> I’m not a professional developer, just a hobbyist with aspirations

Stopped reading.

indigodaddy•1h ago
Why?
compootr•1h ago
I guess you need an active developer license to write blog posts
rvz•1h ago
Or maybe this industry still trusts experienced software engineers to write well maintained and robust software used by millions that make money.
rvz•1h ago
It's quite simple.

I prefer building and using software that is robust, heavily tested and thoroughly reviewed by highly experienced software engineers who understand the code, can detect bugs and can explain what each line of code they write does.

We are now in a phase where embracing mediocre LLM-generated code over heavily tested / scrutinized code is encouraged in this industry - because of the hype of 'vibe coding'.

If you can't even begin to explain the code or point out any bugs generated by LLMs or even off-load architectural decisions to them, you're going to have a big problem in explaining that in code review situations or even in a professional pair-programming scenario.

exe34•54m ago
> I prefer building and using software that is robust, heavily tested and thoroughly reviewed by highly experienced software engineers who understand the code, can detect bugs and can explain what each line of code they write does.

that's amazing. by that logic you probably use like one or two pieces of software max. no windows, macos or gnome for you.

XenophileJKO•20m ago
LOL.. I was going to say after working in the tech industry.. half the time it is a rat's nest in there.

There are excellent engineers.. but there are also many not so great engineers and once the sausage is made it usually isn't a pretty picture inside.

Usually only small young projects or maybe a beautiful component or two. Almost never an entire system/application.

rolisz•47m ago
Unfortunately, all of modern software depends on some random obscure dependency that is not properly reviewed https://xkcd.com/2347/
exe34•56m ago
> I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones.
navane•43m ago
If you kept reading you'd realize the guy was just humble bragging.
pmxi•1h ago
> If you are a heavy user, you should use pay-as-you go pricing

If you're a heavy user you should pay for a monthly subscription for Claude Code, which is significantly cheaper than API costs.

ramesh31•1h ago
Am I alone in spending $1k+/month on tokens? It feels like the most useful dollars I've ever spent in my life. The software I've been able to build on a whim over the last 6 months is beyond my wildest dreams from a year or two ago.
zppln•1h ago
Care to show what you've built?
fainpul•1h ago
> The software I've been able to build on a whim over the last 6 months is beyond my wildest dreams from a year or two ago.

If you don't mind sharing, I'm really curious - what kind of things do you build and what is your skillset?

tovej•6m ago
I would personally never. Do I want to spend all my time reviewing AI code instead of writing? Not really. I also don't like having a worse mental model of the software.

What kind of software are you building that you couldn't before?

athrowaway3z•1h ago
> One of the weird things I found out about agents is that they actually give up on fixing test failures and just disable tests. They’ll try once or twice and then give up.

It's important not to think in terms of generalities like this. How they approach this depends on your test framework, and even on the language you use. If disabling tests is easy and common in that language / framework, it's more likely to do it.

For testing a CLI, I currently use run_tests.sh and never once has it tried to disable a test. Though that can be its own problem when it hits one it can't debug.

# run_tests.sh
# Handle multiple script arguments or default to all .sh files
scripts=("${@/#/./examples/}")
[ $# -eq 0 ] && scripts=(./examples/*.sh)
for script in "${scripts[@]}"; do
    # Print each script name when LOUD is set
    [ -n "$LOUD" ] && echo "$script"
    # Run with tracing; on failure, dump the captured trace and stop
    output=$(bash -x "$script" 2>&1) || {
        echo ""
        echo "Error in $script:"
        echo "$output"
        exit 1
    }
done
echo " OK"
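For frameworks where disabling a test is a one-line change, one possible guard is to scan the agent's diff for newly added skip markers before accepting it. The sketch below is an assumption-laden illustration, not something from the post: the function name `find_disabled_tests` is made up, and the marker list (pytest, Jest-style `.skip(`, Rust `#[ignore]`, Go `t.Skip(`, JUnit `@Disabled`) is incomplete and will need extending for your setup.

```shell
#!/usr/bin/env bash
# Sketch: flag lines *added* in a unified diff that contain a common test-skip marker.
# Reads the diff on stdin; returns 1 if any marker is found, 0 otherwise.
find_disabled_tests() {
    # Illustrative, incomplete list of skip markers (extended regex).
    local pattern='@pytest\.mark\.skip|(^|[^A-Za-z_])(it|test|describe)\.skip\(|#\[ignore\]|t\.Skip\(|@Disabled'
    # Keep added lines ("+...") but drop the "+++ b/file" headers, then look for markers.
    if grep -E '^\+' | grep -Ev '^\+\+\+ ' | grep -E "$pattern"; then
        echo "Possible disabled tests found in the lines above." >&2
        return 1
    fi
    return 0
}

# Example use after an agent run (assumes the project is a git repository):
#   git diff | find_disabled_tests || echo "review before accepting"
```

It only catches textual markers, so an agent that deletes a test outright or empties its body would still slip through.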

----

Another tip: for specific tasks, don't bother with "please read file x.md". Claude Code (and others) accept the @file syntax, which puts that file into context right away.

Lucasoato•10m ago
I've had a lot of success using both codex with gpt5 and claude code with opus. You develop a solution with one, then validate it with the other. I've fixed many bugs by passing the context between them, saying something like: "my other colleague suggested that…". Bonus: I've started symlinking CLAUDE.md files to AGENTS.md, so now I don't even have to maintain two different context files.
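The symlink part might look roughly like this. `link_agent_context` is a hypothetical helper name, and refusing to overwrite an existing regular CLAUDE.md is my own choice rather than anything described above.

```shell
#!/usr/bin/env bash
# Sketch: make CLAUDE.md a relative symlink to AGENTS.md inside a project directory.
# $1 = project directory (defaults to the current directory).
link_agent_context() {
    local dir="${1:-.}"
    if [ ! -f "$dir/AGENTS.md" ]; then
        echo "no AGENTS.md in $dir" >&2
        return 1
    fi
    # Do not clobber a hand-written CLAUDE.md; only replace an existing symlink.
    if [ -e "$dir/CLAUDE.md" ] && [ ! -L "$dir/CLAUDE.md" ]; then
        echo "$dir/CLAUDE.md is a regular file, refusing to overwrite" >&2
        return 1
    fi
    # Link relative to the directory so the link survives moving the repository.
    ( cd "$dir" && ln -sf AGENTS.md CLAUDE.md )
}

# Example: link_agent_context ~/projects/my-app
```

If the repository is also checked out on Windows, symlinks may need extra git configuration, so a copy step could be the safer fallback there.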
