frontpage.

Show HN: CryptoClaw – open-source AI agent with built-in wallet and DeFi skills

https://github.com/TermiX-official/cryptoclaw
1•cryptoclaw•2m ago•0 comments

Show HN: Make OpenClaw Respond in Scarlett Johansson’s AI Voice from the Film Her

https://twitter.com/sathish316/status/2020116849065971815
1•sathish316•4m ago•0 comments

CReact Version 0.3.0 Released

https://github.com/creact-labs/creact
1•_dcoutinho96•5m ago•0 comments

Show HN: CReact – AI Powered AWS Website Generator

https://github.com/creact-labs/ai-powered-aws-website-generator
1•_dcoutinho96•6m ago•0 comments

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•11m ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•13m ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
5•witnessme•17m ago•1 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
2•aloukissas•20m ago•1 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
1•bigbromaker•23m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•29m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
6•alephnerd•32m ago•2 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•32m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
1•pbradv•35m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
3•hasheddan•35m ago•0 comments

EVs Are a Failed Experiment

https://spectator.org/evs-are-a-failed-experiment/
3•ArtemZ•47m ago•5 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•47m ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•49m ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
7•duxup•52m ago•0 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•53m ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•1h ago•1 comments

Deeper into the sharing of one air conditioner for 2 rooms

1•ozzysnaps•1h ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•1h ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•1h ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•1h ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•1h ago•1 comments

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
2•g1raffe•1h ago•0 comments

Open Problems in Mechanistic Interpretability

https://arxiv.org/abs/2501.16496
2•vinhnx•1h ago•0 comments

Bye Bye Humanity: The Potential AMOC Collapse

https://thatjoescott.com/2026/02/03/bye-bye-humanity-the-potential-amoc-collapse/
3•rolph•1h ago•0 comments

Dexter: Claude-Code-Style Agent for Financial Statements and Valuation

https://github.com/virattt/dexter
1•Lwrless•1h ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•vermilingua•1h ago•0 comments

Ask HN: Why are AI coding agents not working for me?

3•rich_sasha•3w ago
I'm really trying to use them with an open mind. I'm writing detailed specs. On failure, I adjust the initial spec, rather than go down the spiral of asking for many adjustments. I'm using Claude Opus 4.5 inside Cursor. My ambitions are also quite low. The latest was to split a mega Python file into a few submodules according to a pretty simple criterion. It's not even that it failed, it is more about the how. It was trying to action the refactor by writing some Python one-liners to edit the file, in an extremely clumsy way - in many cases failing to write syntactically correct Python.
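
For concreteness, the shape of the refactor I was after (hypothetical module names, but this is the whole idea) is just moving code out into a package whose __init__.py re-exports the old names, so existing imports keep working:

    # Before: everything in one mega module, bigmod.py.
    # After: a package that re-exports the old public names, so
    # `from bigmod import Order` keeps working unchanged.
    #
    #   bigmod/
    #       __init__.py   (this file)
    #       models.py
    #       parsing.py
    #       cli.py

    # bigmod/__init__.py
    from bigmod.models import Order, Customer
    from bigmod.parsing import parse_order
    from bigmod.cli import main

    __all__ = ["Order", "Customer", "parse_order", "main"]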

I'm torn, as I don't want to be an old man luddite shouting at the clouds "LLMs are garbage", and plenty of reasonable people seem to do well with them. But my experience is rather poor. So, maybe I'm holding it wrong?

It's not only failures, to be fair. I found it fairly good at writing a lot of benign code: tests, simple tools I wouldn't otherwise bother with that save me a few minutes here and there. But certainly nothing great. Also good at general queries and design questions. But not actually doing my job of being a programmer.

Googling the topic mostly yields various grifters' exclusive online courses in no-code get-rich-quick agents packed with AdWords keywords, or hyper-optimised answers about having hundreds of stored prompts hypertuned for the latest agent. I'm hoping for higher-quality answers here.

Comments

mahaekoh•3w ago
I’m in the same boat. I’ve been working with a more junior engineer who’s ecstatic about AI coding, and I’m slowly settling into the position that, for those of us who have developed tons of opinions about how things should be done, trying to transfer all that experience to an AI through prompting is just not efficient. I’ve grown comfortable with saying it’s easier for me to do it myself than to keep prompting and adjusting prompts. For a more junior engineer, though, it’s a lot easier to accept what the AI did, and as long as it’s functional, their opinions aren’t strong enough to spark the urge to keep adjusting. There’s just a different utility curve for different people.

Does that mean we’ll get worse (or less opinionated) code over time? Maybe. I used to tell my team that code should be written to be easily understood by maintainers, but if all the maintainers are AI and they don’t care, does it matter?

FWIW, I still reach for Claude once in a while, and I find its response useful maybe one time in ten, particularly when dealing with code I don’t feel the need to learn or maintain in the long run. But if reviewing Claude’s code requires me to learn the code base properly, I often might as well write it myself.

seanmcdirmid•3w ago
I’m in the opposite boat: I’m having trouble instructing my colleagues on how to get the same success with AI coding that I’ve had. The issue is that you spend effort “working” the AI to get things done, but at the end of it your only artifact is a bunch of executed CLI commands, and… how are you going to describe that?

AI instructions for AI coding really need to be their own code somehow, so programmers can more successfully share their experiences.

zahlman•3w ago
> but at the end of it your only artifact is a bunch of CLI commands executed

That sounds like a failure of process. Executing the commands is supposed to result in a Git commit history, and in principle nothing prevents logging the agent session. I'm told that some users even prompt the AI afterwards to summarize what happened in the session, write .md files to record "lessons learned", etc.

seanmcdirmid•3w ago
That isn't what I meant: you could save the entire CLI session and still not have something that can be shared easily. You need to document things like "try this or that", and even then it isn't very shareable.

enobrev•3w ago
I haven't used Cursor, so I'm not sure I can be much help there. I've mostly been using Claude Code, with IntelliJ IDEs for code reviews when necessary. Over the past year I've moved to almost entirely coding via agent. Maybe my input will be helpful.

One very important thing to keep in mind is context management. Every time your agent reads a file, searches documentation, answers a question, writes a file, or otherwise iterates on a problem, the context will grow. The larger the context gets, the dumber the responses. It will basically start forgetting earlier parts of the conversation. To be explicit about this, I've disabled "auto-compact" in claude code and when I see a warning that it's getting too big, I cut things off, maybe ask the agent to commit, or write a summary, and then /compact or /clear. It's important to figure out the context limits of the model you're using and stay comfortably within them.

Next, I generally treat the agent like a mid-level engineer who answers to me. That is to say, I do not try to convince it to code like I do, instead I treat it like a member on my team. When I'm on a team, we stick to standards and use tools like prettier, etc to keep the code in shape. My personal preferences go out the window, unless there's solid technical reason for others to follow them.

With that out of the way, the general loop is to plan with the agent, spec the work to be done, let the agent do the work, review, and repeat. To start, I converse with the agent directly. I'm not writing a spec, I'm discussing the problem with the agent and asking the agent to write the spec. We review, and discuss, and once our decisions are aligned and documented, I'll ask it to break down how it would implement the plan we've agreed upon.

From there I'll keep the context size in mind. If implementation is a multi-hour endeavor, I'll work with the agent to break down the problem into pieces that should ideally fit into the context window. Otherwise, by this point the agent will have asked me "would you like me to go ahead and get started?" and I'll let it get started.

Once it's done, I'll ask it to run lint, typechecks, automated testing, do a code review of what's in the current git workspace, compare the changes to the spec, do my own code reviews, run it myself, whatever is needed to make sure what was written solves the problem.
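
(If it helps, the gate I ask the agent to pass boils down to something like the sketch below. ruff, mypy, and pytest are my assumptions here; substitute whatever your project uses.)

    # check.py - the verification gate, run by me and by the agent.
    # Assumes ruff, mypy, and pytest are installed; swap in your own tools.
    import subprocess
    import sys

    CHECKS = [
        ["ruff", "check", "."],  # lint
        ["mypy", "."],           # type checks
        ["pytest", "-q"],        # automated tests
    ]

    def main() -> int:
        for cmd in CHECKS:
            print("$", " ".join(cmd))
            if subprocess.run(cmd).returncode != 0:
                return 1  # fail fast on the first broken check
        return 0

    if __name__ == "__main__":
        sys.exit(main())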

In general, I'd say it's a bad idea to just let the agent go off on its own with a giant task. It should be iterative and communicative. If the task is too big, it WILL take shortcuts. You can probably get an agent to rewrite your whole codebase with a big fancy prompt and a few markdown files. But if you're not part of the process, there's a good chance it'll create a serious mess.

For what you're doing, I would likely ask the agent to read the mega Python file and explain it to me. Then I would discuss what it missed or got wrong, add additional context, and explain what needs to be done. Then I would ask it if it has any suggestions for how we should break it into submodules. If the plan looks good, run with it. If not, explain what you're going for and then ask how it would go about extracting the first submodule. If the plan looks good, ask it to write tests, let it extract the submodule, let it run the tests, review the results, do your own code review, tweak the formatting, Goto 10.
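
(The tests I'd ask for first are characterization tests that pin the current behavior before anything moves; the names below are hypothetical.)

    # test_api_surface.py - pin the public names and one behavior
    # before the split, so the extraction can't silently drop,
    # rename, or break anything. `bigmod` and its names are hypothetical.
    import bigmod

    EXPECTED = {"Order", "Customer", "parse_order", "main"}

    def test_public_surface_unchanged():
        public = {name for name in dir(bigmod) if not name.startswith("_")}
        assert EXPECTED <= public

    def test_parse_order_behavior_unchanged():
        # One behavioral anchor: same input, same output, before and after.
        order = bigmod.parse_order("id=1;qty=2")
        assert (order.id, order.qty) == (1, 2)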

zahlman•3w ago
> Next, I generally treat the agent like a mid-level engineer who answers to me. That is to say, I do not try to convince it to code like I do, instead I treat it like a member on my team. When I'm on a team, we stick to standards and use tools like prettier, etc to keep the code in shape. My personal preferences go out the window, unless there's solid technical reason for others to follow them.

Do you suppose it would work to prompt a separate agent to infer coding style preferences from your commits and then refactor the first agent's work to bring it in line?
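
Gathering the raw material seems mechanical enough. A sketch, with the prompting step itself left to whichever agent CLI you use:

    # style_samples.py - collect your recent hand-written diffs as raw
    # material for a "match this style" prompt to a second agent.
    # Sketch only; the prompting step depends on your agent.
    import subprocess

    def recent_diffs(author: str, n: int = 20) -> str:
        result = subprocess.run(
            ["git", "log", f"--author={author}", f"-{n}", "-p",
             "--no-merges", "--", "*.py"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    if __name__ == "__main__":
        samples = recent_diffs("you@example.com")
        print(samples[:2000])  # feed these samples to the style agent as context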

KellyCriterion•3w ago
For Claude, I can recommend putting/organizing the relevant source files of your app into the project context/container.

I'm also in the opposite boat: Claude is such a boon, it has allowed me to boost my productivity. Though I've mainly used it for single functions which I'm integrating into the code base. I'd say I have a hit rate of 90%+ on the first prompt. Just yesterday I re-prompted a data visualization component which had been developed in the first iteration also with Claude (but Sonnet); it took 3 simple prompts to get some heavy optimization that wasn't done in the first iteration.

I also have to say I really like its ability to write useful comments.

rich_sasha•3w ago
Yeah. It's good for tinkering around the edges, IME. But that's a far cry from replacing software developers!