Do agents.md files help coding agents?

https://twitter.com/rasbt/status/2063649136323252397
15•smushback•3h ago
https://xcancel.com/rasbt/status/2063649136323252397

https://arxiv.org/abs/2602.11988

Comments

DigitalSea•1h ago
Yes
sam_lowry_•51m ago
The referenced paper says No.
Drakim•46m ago
You are absolutely right, thank you for pushing back. Upon further examination, I've confirmed that the referenced paper says no.

/s

kalaksi•34m ago
The paper actually says: "We find that all context files consistently increase the number of steps required to complete tasks. LLM-generated context files have a marginal negative effect on task success rates, while developer-written ones provide a marginal performance gain."

"Overall, our results suggest that context files have only marginal effect on agent behavior, and are likely only desirable when manually written."

zuzululu•1h ago
Not enough on its own; you'd need artifacts to store contexts/TOC/lists.

I think the shorter the better.

Also, a strange finding from my own experience: specific empirical formats seem to yield much better results. For example, people often say "get this done to 100%", but I say "get this to 88.47%".

asp_hornet•1h ago
As the author notes in the end, it would be really interesting to do these again on more recent models. I wonder if the finding that no context file is cheaper still stands. But then, how much does the harness influence the results? It can be frustrating trying to gauge what’s influencing what, and whether something suddenly starts working against you.
wiseowise•1h ago
Putting “you’re an expert jerk off master” in agents.md is the same as a shaman burning a bone to predict the future.
weddpros•37m ago
If adding something to the context doesn't help, it only proves you're not adding the right stuff.

I'm adding pointers to specification documents, and it saves me from the /new dumb coding agent that sees your code base for the first time and knows nothing about architecture, concepts, code organisation, etc...

I'm using no cookie-cutter directives though (except maybe "do not attempt to deploy, we're using CI/CD to deploy" to avoid an automatic "wrangler deploy" to Cloudflare).

RugnirViking•23m ago
Yes, they do. I think people overindex on this paper; I remember when it came out we had a lot of discussion in my company about it. But it's clear to see they do at least change the agent's behavior, and things like telling it "always use xyz version of Java, use Gradle to build the project, use this command to run the tests" are really important instead of letting it fumble about trying to find the right thing every time you ask it anything.

I think the trap some people fall into, and LLM-authored files especially (which is where the paper sees the documents not helping), is describing the code, or the structure of the code, instead. I don't think that helps much: the agent can already see you have 4 modules called a, b, c and d, and can read the readmes inside them just fine if it has questions.

One more marginal thing I find helpful, though I'm less sure it has a positive impact, is describing the right terminology for the agent so it can be smarter at communicating with the developer. Things like different names for the product, products it interfaces with, resource names in infra, terms from the customer and product team. I don't think it helps the agent code (much), but it does help communication if it knows what we mean when we speak (and naming things is, as we know, one of the hard problems in CS).

Overall, most of my agents.md files are now a list of useful bash commands for working with and testing the project (here's how to spin up docker services, here's how to update the libraries, here's how to run a command against the local db, here's how to insert a document to be run, etc.).

And then at the end a terminology blob, which I find myself referencing too.
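
For anyone who wants to picture the layout described above (commands first, terminology blob at the end), here is a rough Python sketch that renders such a file. Everything in it is an assumption for illustration: the function name `render_agents_md`, the section headings, the Gradle and Docker commands, and the "Ledger" term are all hypothetical and not taken from the commenter's actual files.

```python
def render_agents_md(commands, terminology):
    """Render a short AGENTS.md: runnable commands first, terminology last.

    Both arguments are plain dicts. The layout is one guess at what a
    commands-plus-terminology file could look like, not a standard.
    """
    lines = ["# AGENTS.md", "", "## Commands", ""]
    for purpose, command in commands.items():
        # One bullet per task, with the exact command in backticks.
        lines.append(f"- {purpose}: `{command}`")
    lines += ["", "## Terminology", ""]
    for term, meaning in terminology.items():
        lines.append(f"- **{term}**: {meaning}")
    return "\n".join(lines) + "\n"


# Hypothetical project details, for illustration only.
example = render_agents_md(
    commands={
        "Build": "./gradlew build",
        "Run tests": "./gradlew test",
        "Start local services": "docker compose up -d",
    },
    terminology={
        "Ledger": "internal name for the billing service",
    },
)
print(example)
```

Note what the sketch leaves out: there is no description of modules or code structure, which the comment argues the agent can already discover on its own.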

sebra•18m ago
The tweet misses the conclusion from the paper that handcrafted AGENTS.md might help. To me it's no surprise that 100% vibed AGENTS.md files are unproductive. Not reviewing your design docs is probably even worse than not reviewing your code? I've seen some AI-generated agents.md files which were just plain wrong. No surprise agents perform worse after reading those.

I use AGENTS.md to make sure my agents loop effectively (tests, quality, etc). Not to describe the code / architecture.

kandros•1m ago
The amount of cargo-culting around AI tooling and practices is so weird to me.

Why not just try and see? The fast feedback loop allows testing all kinds of weird theories in a matter of 30m-1h during normal working sessions, and most results are obvious.
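
One possible way to make "try and see" slightly less anecdotal is to log each agent run with and without the context file, then compare success rate and step count, the two measures quoted from the paper earlier in the thread. The Python sketch below assumes you record those two fields yourself; the `Run` class, the `summarize` function, and all the numbers are made-up placeholders, not measurements.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """One agent session: did it finish the task, and in how many steps."""
    success: bool
    steps: int


def summarize(runs):
    """Return success rate and mean step count for a list of runs."""
    if not runs:
        raise ValueError("need at least one run to summarize")
    return {
        "success_rate": sum(r.success for r in runs) / len(runs),
        "mean_steps": mean(r.steps for r in runs),
    }


# Placeholder values only: replace with your own logged sessions.
with_file = [Run(True, 12), Run(False, 20), Run(True, 16)]
without_file = [Run(True, 10), Run(False, 14)]

print("with AGENTS.md:   ", summarize(with_file))
print("without AGENTS.md:", summarize(without_file))
```

With only a handful of sessions the difference is mostly noise, so treat this as a sanity check on your own workflow and not as evidence either way.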
