New Research Reassesses the Value of Agents.md Files for AI Coding

https://www.infoq.com/news/2026/03/agents-context-file-value-review/
14•noemit•2h ago

Comments

verdverm•2h ago
That research has been so misinterpreted for headlines and clicks...

AGENTS.md are extremely helpful if done well.

lucketone•1h ago
Everybody thinks they do agents.md well
noemit•2h ago
The research mostly points to LLM-generated context lowering performance. Human-generated context improves performance, but any kind of AGENTS.md file increases token use, for what they say is "fake thinking." More research is needed.
d1sxeyes•1h ago
Agree. Also, sometimes I intentionally want the agent to do something differently to how it would naturally solve the problem. For example, there might be a specific design decision that the agent should adhere to. Obviously, this will lead to slower task completion, higher inference costs etc. because I’m asking the agent not to take the path of least resistance.

This kind of benchmark completely misses that nuance.

stingraycharles•1h ago
I’d say that it needs to be maintained and reviewed by a human, but it’s perfectly fine to let an LLM generate it.
sheept•31m ago
If you let an LLM generate it (e.g. Claude's /init), it'll be a lot more verbose than it needs to be, which wastes tokens and deemphasizes any project-specific preferences you actually want the agent to heed.
stingraycharles•1h ago
What is going on in this thread and why are all comments downvoted so heavily?
nayroclade•1h ago
I suspect AGENTS.md files will prove to be a short-lived relic of an era when we had to treat coding agents like junior devs, who often need explicit instructions and guardrails about testing, architecture, repo structure, etc. But when agents have judgement equal to (or better than) a senior engineer's, they can make their own calls about these aspects, and trying to "program" their behaviour via an AGENTS.md file becomes as unhelpful as one engineer trying to micro-manage another's approach to solving a problem.
sdenton4•32m ago
Eh, even for a senior engineer, dropping into a new codebase is greatly helped by an orientation from someone who works on the code. What's where, common gotchas, which tests really matter, and so on. The agents file serves a similar role.
dev_l1x_be•1h ago
I never use these files. Instead, I give each short agent run the guardrails for its specific task. Having task-specific “agents.md” files works better for me.
CrzyLngPwd•46m ago
I have a legacy codebase of around 300k lines spread across 1.5k files, and have had amazing success with the agents.md file.

It just prevents hallucinations and coerces the AI to use existing files and APIs instead of inventing them. It also has gold-standard tests and APIs as examples.

Before the agents file, it was just chaos of hallucinations and having to correct it all the time with the same things.

OutOfHere•29m ago
You might have better luck with more focused task-specific instructions if you can be bothered to write them.
lmeyerov•35m ago
I liked that they did this work and its sister paper, but disliked how it was positioned as basically the opposite of the truth. It set up the community to misinterpret it from a quick read, punishing people for a quick title or abstract scan. So for the next X months, instead of the paper helping, we have to deal with the brain damage.

The good: It shows on one kind of benchmark, some flavors of agentically-generated docs don't help on that task. So naively generating these, for one kind of task, doesn't work. Thank you, useful to know!

The bad: Some people assume this means in general these don't work, or automation can't generate useful ones.

The truth: These files help measurably, and just a bit of engineering enables you to guarantee high scores for the typical cases. As soon as you have an objective function, you can flip it into an eval, and set an AI coder to editing these files until they work.

Ex: We recently released https://github.com/graphistry/graphistry-skills for more easily using graphistry via AI coding, and by having our authoring AI loop a bit with our evals, we jumped the scores from 30-50% success rate to 90%+. As we encounter more scenarios (and mine them from our chats etc), it's pretty straightforward to flip them into evals and ask Claude/Codex to loop until those work well too.

We do these kinds of eval-driven AI coding loops all the time, and IMO how to engineer these should be the message, not that they don't work on average. Deeper example near the middle/end of the talk here: https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-t...
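
For readers who want a concrete picture of the loop described above, here is a rough sketch of the general shape, not the graphistry-skills implementation. All names (`success_rate`, `refine`, `propose_edit`, the toy evals) are hypothetical. In a real setup each eval would presumably run an agent against a task with the instructions file loaded, and `propose_edit` would ask a coding agent to revise the file; here both are replaced with trivial stand-ins so the loop runs on its own.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: an eval takes the instructions text and reports pass/fail.
Eval = Callable[[str], bool]
# Hypothetical editor: given the text and indices of failing evals, return a revision.
Editor = Callable[[str, List[int]], str]


def success_rate(instructions: str, evals: Sequence[Eval]) -> float:
    """Fraction of eval cases that pass for the given instructions text."""
    if not evals:
        raise ValueError("need at least one eval case")
    return sum(1 for case in evals if case(instructions)) / len(evals)


def refine(
    instructions: str,
    evals: Sequence[Eval],
    propose_edit: Editor,
    target: float = 0.9,
    max_rounds: int = 10,
) -> Tuple[str, float]:
    """Keep a candidate edit only if it raises the eval score; stop at target."""
    best, best_score = instructions, success_rate(instructions, evals)
    for _ in range(max_rounds):
        if best_score >= target:
            break
        failing = [i for i, case in enumerate(evals) if not case(best)]
        candidate = propose_edit(best, failing)
        score = success_rate(candidate, evals)
        if score > best_score:  # reject edits that do not improve the score
            best, best_score = candidate, score
    return best, best_score


# Toy stand-ins (assumptions for illustration only): an eval "passes" when the
# file mentions a phrase, and the editor appends the first missing phrase.
REQUIRED = ["run the tests", "use the existing API", "no new dependencies"]
toy_evals = [lambda text, phrase=phrase: phrase in text for phrase in REQUIRED]


def toy_propose_edit(text: str, failing: List[int]) -> str:
    return text + "\n- " + REQUIRED[failing[0]]


final_text, final_score = refine("# AGENTS.md", toy_evals, toy_propose_edit)
print(final_score)
```

The only design choice worth noting is that an edit is kept solely when the measured score goes up, so a verbose or unhelpful revision is discarded rather than accumulated.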

OutOfHere•32m ago
Duplicate of https://news.ycombinator.com/item?id=47280099

SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via CI

https://arxiv.org/abs/2603.03823
39•mpweiher•2h ago•3 comments

Cloud VM benchmarks 2026

https://devblog.ecuadors.net/cloud-vm-benchmarks-2026-performance-price-1i1m.html
235•dkechag•9h ago•101 comments

"Warn about PyPy being unmaintained"

https://github.com/astral-sh/uv/pull/17643
182•networked•8h ago•64 comments

Show HN: Curiosity – DIY 6" Newtonian Reflector Telescope

https://curiosity-telescope.vercel.app/
20•big_Brain69•2h ago•2 comments

From RGB to L*a*b* color space (2024)

https://kaizoudou.com/from-rgb-to-lab-color-space/
34•kqr•4d ago•7 comments

CasNum

https://github.com/0x0mer/CasNum
281•aebtebeten•13h ago•35 comments

Notes on Writing WASM

https://notes.brooklynzelenka.com/Blog/Notes-on-Writing-Wasm
4•vinhnx•49m ago•0 comments

How to run Qwen 3.5 locally

https://unsloth.ai/docs/models/qwen3.5
160•Curiositry•10h ago•45 comments

Rijksmuseum researchers discover new painting by Rembrandt van Rijn

https://www.rijksmuseum.nl/en/press/press-releases/rijksmuseum-researchers-discover-new-painting-...
12•ohjeez•3d ago•0 comments

MonoGame: A .NET framework for making cross-platform games

https://github.com/MonoGame/MonoGame
78•azhenley•7h ago•47 comments

A decade of Docker containers

https://cacm.acm.org/research/a-decade-of-docker-containers/
301•zacwest•17h ago•202 comments

Emacs internals: Deconstructing Lisp_Object in C (Part 2)

https://thecloudlet.github.io/blog/project/emacs-02/
72•thecloudlet•2d ago•3 comments

Dumping Lego NXT firmware off of an existing brick (2025)

https://arcanenibble.github.io/dumping-lego-nxt-firmware-off-of-an-existing-brick.html
206•theblazehen•2d ago•11 comments

Yoghurt delivery women combatting loneliness in Japan

https://www.bbc.com/travel/article/20260302-the-yoghurt-delivery-women-combatting-loneliness-in-j...
290•ranit•21h ago•156 comments

I'm Not Consulting an LLM

https://lr0.org/blog/p/gpt/
25•birdculture•1h ago•3 comments

Show HN: A weird thing that detects your pulse from the browser video

https://pulsefeedback.io/
83•kilroy123•3d ago•38 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
7•surprisetalk•3d ago•1 comment

Best performance of a C++ singleton

https://andreasfertig.com/blog/2026/03/best-performance-of-a-cpp-singleton/
34•jandeboevrie•1d ago•20 comments

Autoresearch: Agents researching on single-GPU nanochat training automatically

https://github.com/karpathy/autoresearch
118•simonpure•13h ago•32 comments

The surprising whimsy of the Time Zone Database

https://muddy.jprs.me/links/2026-03-06-the-surprising-whimsy-of-the-time-zone-database/
124•jprs•15h ago•36 comments

In 1985 Maxell built a bunch of life-size robots for its bad floppy ad

https://buttondown.com/suchbadtechads/archive/maxell-life-size-robots/
111•rfarley04•3d ago•13 comments

Ten years of deploying to production

https://brandonvin.github.io/2026/03/04/ten-years-of-deploying-to-production.html
30•mooreds•2d ago•4 comments

Ask HN: Why there are no actual studies that show AI is more productive?

17•make_it_sure•1h ago•16 comments

FLASH radiotherapy's bold approach to cancer treatment

https://spectrum.ieee.org/flash-radiotherapy
212•marc__1•18h ago•65 comments

macOS code injection for fun and no profit (2024)

https://mariozechner.at/posts/2024-07-20-macos-code-injection-fun/
94•jstrieb•3d ago•17 comments

Files are the interface humans and agents interact with

https://madalitso.me/notes/why-everyone-is-talking-about-filesystems/
222•malgamves•23h ago•121 comments

Sem – Semantic version control. Entity-level diffs on top of Git

https://github.com/ataraxy-labs/sem
6•pabs3•4h ago•0 comments

Lisp-style C++ template meta programming

https://github.com/mistivia/lmp
51•mistivia•11h ago•6 comments

SigNoz (YC W21) is hiring for engineering, growth and product roles

https://signoz.io/careers
1•pranay01•17h ago

To the Polypropylene Makers

https://www.lesswrong.com/posts/HQTueNS4mLaGy3BBL/here-s-to-the-polypropylene-makers
31•raldi•3h ago•4 comments