Ask HN: Why are AI coding agents not working for me?

2•rich_sasha•1h ago
I'm really trying to use them with an open mind. I'm writing detailed specs. On failure, I adjust the initial spec, rather than go down the spiral of asking for many adjustments. I'm using Claude Opus 4.5 inside Cursor. My ambitions are also quite low. The latest was to split a mega Python file into a few submodules according to a pretty simple criterion. It's not even that it failed, it is more about the how. It was trying to action the refactor by writing some Python one-liners to edit the file, in an extremely clumsy way - in many cases failing to write syntactically correct Python.

I'm torn, as I don't want to be an old man luddite shouting at the clouds "LLMs are garbage", and plenty of reasonable people seem to do well with them. But my experience is rather poor. So, maybe I'm holding it wrong?

It's not only failures, to be fair. I found it fairly good at writing a lot of benign code, like tests, simple tools I wouldn't bother with that save me a few mins here and there. But certainly nothing great. Also good at general queries and asking design questions. But not actually doing my job of being a programmer.

Googling the topic mostly yields various grifters' exclusive online courses in no-code get-rich-quick agents packed with AdWords keywords, or hyper-optimised answers about having 100s of stored prompts hypertuned for the latest agent. I'm hoping for higher-quality answers here.

Comments

mahaekoh•1h ago
I’m in the same boat. I’ve been working with a more junior engineer who’s ecstatic about AI coding, and I’m slowly settling into the position that, for those of us who have developed tons of opinions about how things should be done, trying to transfer all that experience to an AI through prompting is just not efficient. I’ve grown comfortable with saying it’s easier for me to do it myself than to keep prompting and adjusting prompts. For a more junior engineer, though, it’s a lot easier to accept what the AI did, and as long as it’s functional, their opinions aren’t strong enough to spark the urge to keep adjusting. There’s just a different utility curve for different people.

Does that mean we’ll get worse (or less opinionated) code over time? Maybe. I used to tell my team that code should be written to be easily understood by maintainers, but if all the maintainers are AI and they don’t care, does it matter?

FWIW, I still reach for Claude once in a while, and I find its response useful maybe one out of ten times, particularly when dealing with code I don’t feel the need to learn or maintain in the long run. But if reviewing Claude’s code requires me to learn the code base properly, I often might as well write it myself.

seanmcdirmid•1h ago
I’m in the opposite boat, having trouble instructing my colleagues on how to get the same success with AI coding that I’ve realized. The issue is that you spend effort “working” the AI to get things done, but at the end of it your only artifact is a bunch of CLI commands executed and…how are you going to describe that?

AI instructions for AI coding really need to be their own code somehow, so programmers can more successfully share their experiences.

enobrev•1h ago
I haven't used Cursor, so I'm not sure I can be much help there. I've been mostly using Claude Code and IntelliJ IDEs for code reviews when necessary. Over the past year I've moved to almost entirely coding via agent. Maybe my input will be helpful.

One very important thing to keep in mind is context management. Every time your agent reads a file, searches documentation, answers a question, writes a file, or otherwise iterates on a problem, the context will grow. The larger the context gets, the dumber the responses. It will basically start forgetting earlier parts of the conversation. To be explicit about this, I've disabled "auto-compact" in Claude Code, and when I see a warning that the context is getting too big, I cut things off, maybe ask the agent to commit or write a summary, and then /compact or /clear. It's important to figure out the context limits of the model you're using and stay comfortably within them.
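To make that bookkeeping concrete, here's a minimal sketch of tracking a context budget by hand. It is not a description of how any agent tracks context internally: `ContextBudget`, the 70% threshold, and the roughly-four-characters-per-token estimate are all assumptions for illustration, and real tokenizers will give different counts.

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    """Rough running estimate of how full the model's context window is."""

    limit_tokens: int           # assumed context window of the model in use
    warn_fraction: float = 0.7  # hypothetical "time to wrap up" threshold
    used_tokens: int = 0

    def add(self, text: str, chars_per_token: int = 4) -> None:
        # Crude estimate: about 4 characters per token for English text.
        self.used_tokens += max(1, len(text) // chars_per_token)

    def should_compact(self) -> bool:
        # True once the estimate crosses the warning threshold.
        return self.used_tokens >= self.limit_tokens * self.warn_fraction


budget = ContextBudget(limit_tokens=1000)
budget.add("x" * 2000)   # e.g. a file the agent just read
first_check = budget.should_compact()
budget.add("x" * 1000)   # e.g. a long answer
second_check = budget.should_compact()
```

The point is only that every file read and every answer counts against the same budget, so the decision to summarize and clear is better made before the limit than at it.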

Next, I generally treat the agent like a mid-level engineer who answers to me. That is to say, I do not try to convince it to code like I do. Instead, I treat it like a member of my team. When I'm on a team, we stick to standards and use tools like Prettier, etc. to keep the code in shape. My personal preferences go out the window, unless there's a solid technical reason for others to follow them.

With that out of the way, the general loop is to plan with the agent, spec the work to be done, let the agent do the work, review, and repeat. To start, I converse with the agent directly. I'm not writing a spec, I'm discussing the problem with the agent and asking the agent to write the spec. We review, and discuss, and once our decisions are aligned and documented, I'll ask it to break down how it would implement the plan we've agreed upon.

From there I'll keep the context size in mind. If implementation is a multi-hour endeavor, I'll work with the agent to break down the problem into pieces that should ideally fit into the context window. Otherwise, by this point the agent will have asked me "would you like me to go ahead and get started?" and I'll let it get started.

Once it's done, I'll ask it to run lint, typechecks, and automated tests, do a code review of what's in the current git workspace, and compare the changes to the spec. Then I'll do my own code reviews, run it myself, whatever is needed to make sure what was written solves the problem.
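One way to make that verification step repeatable is to put the checks behind a single script that both you and the agent can run. This is a sketch under assumptions: `run_checks` and `EXAMPLE_CHECKS` are hypothetical names, and the commands are stand-ins that only invoke the Python interpreter, so replace them with whatever linter, type checker, and test runner your project actually uses.

```python
import subprocess
import sys
from typing import NamedTuple


class CheckResult(NamedTuple):
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run every check, even after a failure, so one pass reports everything."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, capture_output=True, text=True)
        results.append(
            CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)
        )
    return results


# Placeholder commands so the sketch runs anywhere; swap in your real tools.
EXAMPLE_CHECKS = {
    "lint": [sys.executable, "-c", "print('lint ok')"],
    "tests": [sys.executable, "-c", "import sys; sys.exit(1)"],
}

results = run_checks(EXAMPLE_CHECKS)
summary = [f"{'PASS' if r.passed else 'FAIL'}  {r.name}" for r in results]
```

Running all checks rather than stopping at the first failure gives the agent the full list of problems in one go, which costs fewer round trips and less context.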

In general, I'd say it's a bad idea to just let the agent go off on its own with a giant task. It should be iterative and communicative. If the task is too big, it WILL take shortcuts. You can probably get an agent to rewrite your whole codebase with a big fancy prompt and a few markdown files. But if you're not part of the process, there's a good chance it'll create a serious mess.

For what you're doing, I would likely ask the agent to read the mega Python file and explain it to me. Then I would discuss what it missed or got wrong and add additional context and explain what needs to be done. Then I would ask it if it has any suggestions for how we should break it into submodules. If the plan looks good, run with it. If not, explain what you're going for and then ask how it would go about extracting the first submodule. If the plan looks good, ask it to write tests, let it extract the submodule, let it run the tests, review the results, do your own code review, tweak the formatting, Goto 10.
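For the split itself, a small script built on Python's standard `ast` module tends to be sturdier than one-liners that edit the file as text, and it's the kind of thing you can ask the agent to write and then review. Below is a minimal sketch, assuming the criterion can be decided from each top-level name. `plan_split`, `by_prefix`, and the sample source are hypothetical. It only groups functions and classes. Imports, constants, and cross-references between the new submodules would still need handling.

```python
import ast
from collections import defaultdict
from typing import Callable


def plan_split(source: str, pick_module: Callable[[str], str]) -> dict[str, list[str]]:
    """Group top-level functions and classes by target submodule.

    Returns {submodule_name: [source text of each definition]}.
    Other top-level statements (imports, constants) are skipped.
    """
    lines = source.splitlines()
    plan = defaultdict(list)
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        # Decorators sit above the def/class line, so start from the earliest one.
        start = min([node.lineno] + [d.lineno for d in node.decorator_list])
        plan[pick_module(node.name)].append("\n".join(lines[start - 1 : node.end_lineno]))
    return dict(plan)


def by_prefix(name: str) -> str:
    # Hypothetical criterion: the part of the name before the first underscore.
    return name.split("_", 1)[0]


EXAMPLE = '''\
import os

def io_read(path):
    return open(path).read()

@cache
def db_connect():
    pass

class io_Writer:
    pass
'''

plan = plan_split(EXAMPLE, by_prefix)
```

Here `plan` maps `"io"` to two definitions and `"db"` to one, with the decorator kept attached to `db_connect`. Because the source is parsed rather than pattern-matched, a syntactically broken input fails loudly at `ast.parse` instead of producing a half-edited file.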
