frontpage.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
439•klaussilveira•6h ago•100 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
785•xnx•11h ago•475 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
151•isitcontent•6h ago•15 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
16•matheusalmeida•1d ago•0 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
137•dmpetrov•6h ago•60 comments

A century of hair samples proves leaded gas ban worked

https://arstechnica.com/science/2026/02/a-century-of-hair-samples-proves-leaded-gas-ban-worked/
78•jnord•3d ago•5 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
44•quibono•4d ago•3 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
254•vecti•8h ago•120 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
316•aktau•12h ago•155 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
181•eljojo•9h ago•124 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
315•ostacke•12h ago•85 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
398•todsacerdoti•14h ago•218 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
325•lstoll•12h ago•235 comments

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

https://github.com/Deso-PK/make-trust-irrelevant
6•DesoPK•54m ago•2 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
48•phreda4•5h ago•8 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
15•kmm•4d ago•1 comment

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
109•vmatsiiako•11h ago•34 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
188•i5heu•9h ago•131 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
145•limoce•3d ago•79 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
239•surprisetalk•3d ago•31 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
982•cdrnsf•15h ago•417 comments

I'm going to cure my girlfriend's brain tumor

https://andrewjrod.substack.com/p/im-going-to-cure-my-girlfriends-brain
53•ray__•3h ago•13 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
41•rescrv•14h ago•17 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
19•gfortaine•4h ago•2 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
4•gmays•1h ago•0 comments

Evaluating and mitigating the growing risk of LLM-discovered 0-days

https://red.anthropic.com/2026/zero-days/
36•lebovic•1d ago•11 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
77•antves•1d ago•57 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
60•SerCe•2h ago•47 comments

The Oklahoma Architect Who Turned Kitsch into Art

https://www.bloomberg.com/news/features/2026-01-31/oklahoma-architect-bruce-goff-s-wild-home-desi...
19•MarlonPro•3d ago•4 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
40•nwparker•1d ago•10 comments

Context engineering

https://chrisloy.dev/post/2025/08/03/context-engineering
98•chrisloy•3mo ago

Comments

elteto•3mo ago
Are there any open source examples of good context engineering or agent systems?
calebkaiser•3mo ago
Any of the "design patterns" listed in the article will have a ton of popular open source implementations. For structured generation, I think outlines is a particularly cool library, especially if you want to poke around at how constrained decoding works under the hood: https://github.com/dottxt-ai/outlines
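For a flavor of what constrained decoding looks like, here is a minimal sketch assuming outlines' pre-1.0 API (module names have shifted between versions, so treat this as illustrative):

```python
# Minimal constrained-decoding sketch with outlines (pre-1.0 API;
# exact module layout varies by version). The library compiles the
# regex into a token mask applied at each decoding step.
import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.regex(model, r"(yes|no|maybe)")

answer = generator("Is water wet? Answer yes, no, or maybe: ")
print(answer)  # guaranteed to match the pattern: yes, no, or maybe
```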
CjHuber•3mo ago
I'd consider DSPy to be one. While the prompts it uses are not the most elaborate, they are well tested and reliable.
voidhorse•3mo ago
There is nothing precise about crafting prompts and context—it's just that, a craft. Even if you do the right thing and check some fuzzy boundary conditions using autoscorers, the model can still change out from beneath you at any point and totally alter the behavior of your system. There is no formal language here. After all, mathematics exists because natural language is notoriously imprecise.

The article has some good practical tips, and it's not on the author, but man, I really wish we'd stop abusing the term "engineering" in a desperate attempt to stroke our own egos and/or convince people to give us money. It's pathetic. Coming up with good inputs to LLMs is more art than science; it's a craft. Call a spade a spade.

qrios•3mo ago
I agree with you one hundred percent.

But, interestingly, the behavior of LLMs in different contexts is also the subject of scientific research.

satisfice•3mo ago
My thoughts exactly. The author is saying we should think strategically about the use of context. Sure. Yes. But for that to qualify as engineering, we need a solid theory of how context works.

We don't have that yet. For instance, experiments show that not all parts of the context window are equally well attended to. Imagine trying to engineer a bridge when no one really knows how strong steel is.

skeeter2020•3mo ago
or how wide the river is year round
chrisweekly•3mo ago
"Context crafting", ok, sure. I think a lot of expert researchers (like simonw) would agree.
calebkaiser•3mo ago
I think it's fair to question the use of the term "engineering" throughout a lot of the software industry. But to be fair to the author, his focus in the piece is on design patterns that require what we'd commonly call software engineering to implement.

For example, his first listed design pattern is RAG. To implement such a system from scratch, you'd need to construct a data layer (commonly a vector database), retrieval logic, etc.

In fact I think the author largely agrees with you re: crafting prompts. He has a whole section admonishing "prompt engineering" as magical incantations, which he differentiates from his focus here (software which needs to be built around an LLM).

I understand the general uneasiness around using "engineering" when discussing a stochastic model, but I think it's worth pointing out that there is a lot of engineering work required to build the software systems around these models. Writing software to parse context-free grammars into masks to be applied at inference, for example, is as much "engineering" as any other common software engineering project.
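As a rough illustration of the systems being described, a bare-bones RAG loop might look like the sketch below, where embed, index, and llm are hypothetical stand-ins rather than any particular library's API:

```python
# Bare-bones RAG sketch. `embed`, `index`, and `llm` are hypothetical
# stand-ins for an embedding model, a vector store, and a model call.
def answer_with_rag(question, embed, index, llm, k=3):
    # Retrieval: embed the query and fetch the k nearest documents.
    docs = index.search(embed(question), k=k)
    # Augmentation: pack the retrieved text into the context window.
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    # Generation: the model answers grounded in the retrieved context.
    return llm(prompt)
```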

amonks•3mo ago
long shot, apropos of nothing, just recognized your name:

If you are the cincinnatian poet Caleb Kaiser, we went to college together and I’d love to catch up. Email in profile.

If you aren’t, disregard this. Sorry to derail the thread.

calebkaiser•3mo ago
Hello friend!
alt187•3mo ago
Wow, this is incredible.
voidhorse•3mo ago
Agreed. I'm glad this thread could be a vehicle for this interaction!
grigio•3mo ago
I'd like an RSS feed of this blog.
vladsanchez•3mo ago
It's available (https://buttondown.com/chrisloy/rss) but it's not in sync with the blog; just a single 2024 entry found. :shrug:
chrisloy•3mo ago
That's just a feed for my extremely occasional newsletter, this is the blog one: https://chrisloy.dev/rss.xml
chrisloy•3mo ago
Seems I broke this with a recent change! Reinstated: https://chrisloy.dev/rss.xml
aeve890•3mo ago
Are we still calling these things engineering?
skeeter2020•3mo ago
"professionally trained & legally responsible for the results" is definitely not the same thing as what we used to just call "good at googling".
aeve890•3mo ago
I'd say this shit is even worse than "good at googling". Literal incantations for stochastic machines are just two notches above checking the horoscope.
calebkaiser•3mo ago
Based on the comments, I expected this to be slop listing a bunch of random prompt snippets from the author's personal collection.

I'm honestly a bit confused at the negativity here. The article is incredibly benign and reasonable. Maybe a bit surface level and not incredibly in depth, but at a glance, it gives fair and generally accurate summaries of the actual mechanisms behind inference. The examples it gives for "context engineering patterns" are actual systems that you'd need to implement (RAG, structured output, tool calling, etc.), not just a random prompt, and they're all subject to pretty thorough investigation from the research community.

The article even echoes your sentiments about "prompt engineering," down to the use of the word "incantation". From the piece:

> This was the birth of so-called "prompt engineering", though in practice there was often far less "engineering" than trial-and-error guesswork. This could often feel closer to uttering mystical incantations and hoping for magic to happen, rather than the deliberate construction and rigorous application of systems thinking that epitomises true engineering.

timr•3mo ago
There’s nothing particularly wrong with the article - it’s a superficial summary of stuff that has historically happened in the world of LLM context windows.

The problem is - and it’s a problem common to AI right now - you can’t generalize anything from it. The next thing that drives LLMs forward could be an extension of what you read about here, or it could be a totally random other thing. There are a million monkeys tapping on keyboards, and the hope is that someone taps out Shakespeare’s brain.

calebkaiser•3mo ago
I don't really understand this line of criticism, in this context.

What would "generalizing" the information in this article mean? I think the author does a good job of contextualizing most of the techniques under the general umbrella of in-context learning. What would it mean to generalize further beyond that?

simonw•3mo ago
Yes, and we've also decided that they deserve the title "engineering" more than software engineering does.

Most engineering disciplines have to deal with tolerances and uncertainty - the real world is non-deterministic.

Software engineering is easy in comparison because computers always do exactly what you tell them to do.

The ways LLMs fail (and the techniques you have to use to account for that) have more in common with physical engineering disciplines than software engineering does!

timr•3mo ago
lol. who is “we”? I honestly can’t tell if you’re being serious.

I’m going to start a second career in lottery “engineering”, since that’s a stochastic process too.

simonw•3mo ago
The "we" was a tongue-in-cheek reference to the "we" in the original question:

> Are we still calling these things engineering?

timr•3mo ago
Yeah, I understand the symmetry, but…it begs the question.
scuff3d•3mo ago
Lol. This has to be a troll. No way someone seriously wrote this and meant it.
simonw•3mo ago
Little bit of both.
voakbasda•3mo ago
In the absence of a clear indicator, either interpretation could be possible:

https://en.wikipedia.org/wiki/Poe's_law

cadamsdotcom•3mo ago
Yep. Consider woodworking - the wood you use might warp over time, or maybe part of it ends up in the sun or the thing you’ll make gets partly exposed to water.

Can you make a thing that’ll serve its purpose and look good for years under those constraints? A professional carpenter can.

We have it easy in software.

dingnuts•3mo ago
Woodworking is to civil engineering as being an IT help desk rep is to being a software engineer. Woodworking isn't engineering either. If you build a system with aspects you can measure and predictably tune, you're engineering. If you're making skilled alterations to an existing structure or system without applied math or science, you're partaking in a craft.

Software engineering blurs the lines, sure, but woodworking isn't engineering ever.

mpalmer•3mo ago
Physical engineers might scoff good-naturedly at an attempt by project managers to refer to work scheduling as "logistics engineering".

But they really shouldn't, because scheduling and logistics are obviously difficult, involving a lot of uncertainty and tolerances.

timr•3mo ago
Uncertainty and tolerance imply that you have a predictable distribution in the first place.

Engineers are not just dealing with a world of total chaos, observing the output of the chaos, and cargo culting incantations that seem to work for right now [1]…oh wait nevermind we’re doing a different thing today! Have you tried paying for a different tool, because all of the real engineers are using Qwghlm v5 Dystopic now?

There’s actually real engineering going on in the training and refining of these models, but I personally wouldn’t include the prompting fad of the week to fall under that umbrella.

[1] I hesitate to write that sentence because there was a period where, say, bridges and buildings were constructed in this manner. They fell down a lot, and eventually we made predictable, consistent theoretical models that guide actual engineering, as it is practiced today. Will LLM stuff eventually get there? Maybe! But right now we’re still plainly in the phase of trying random shit and seeing what falls down.

bdangubic•3mo ago
exactly why calling this engineering is downright criminal
voidhorse•3mo ago
I completely agree that much of software engineering is not engineering, and building systems around LLMs is no better in this sense.

When the central component of your system is a black box that you cannot reason about, have no theory around, and have essentially no control over (a model update can completely change your system's behavior), engineering is basically impossible from the start.

Practices like using autoscorers to try to constrain behaviors help, but this doesn't make the enterprise any more engineering, because of the black box problem. Traditional engineering disciplines are able to call themselves engineering only because they are built on sophisticated physical theories that give them a precise understanding of the behaviors of materials under specified conditions. No such precision is possible with LLMs, as far as I have seen.

The determinism of traditional computing isn't really relevant here and targets the wrong logical level. We engineer systems, not programs.

empath75•3mo ago
This is completely backwards. Engineers built steam engines first through trial and error and then eventually the laws of thermodynamics were invented to explain how steam engines work.

Trial and error and fumbling around and creating rules of thumb for systems you don't entirely understand is the purest form of engineering.

voidhorse•3mo ago
I would argue it's more correct to call that phase experimentation. I doubt the early manufacturers of steam machines would even call themselves engineers in a serious or precise sense. They were engineers in the sense of "builder of an engine", a specific object, but the term's meaning has evolved from that basic initial usage.

A discipline becomes engineering when we achieve a level of understanding such that we can be mathematically precise about it. Of course experimentation and trial and error are a fundamental part of that process, but there's a reason we have a word to distinguish processes which become more certain and precise thereafter, and why we don't just call anything and everything engineering of some form.

graemefawcett•3mo ago
I think it's still fair to call yourself an engineer while you're using a tool that might still be new. It doesn't change the principles of engineering just because you have a slightly different tool in your tool belt.

You're right that we're still learning how to use them properly. If someone's purely sitting in front of an all-you-can-eat vibe coding machine and trying to one-shot themselves into a fortune with their next startup, then absolutely, they don't deserve to call themselves an engineer.

But just using AI as an assistive technology does not take away from your abilities as an engineer. Used properly, it can be a significant force multiplier.

aeve890•3mo ago
>The ways LLMs fail (and the techniques you have to use to account for that) have more in common with physical engineering disciplines than software engineering does!

Ah yes, the God-given free parameters in the Standard Model, including obviously the random seed of a transformer. What if we just put 0 in the inference temperature? The randomness in LLMs is a technical choice to generate variations in the selection of the next token. Physical engineering? Come on.
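The mechanism being described, as a toy sketch with made-up logits: at temperature zero, sampling collapses to a greedy argmax.

```python
# Toy sketch of temperature in next-token sampling: as T -> 0 the
# softmax sharpens, and at T = 0 sampling becomes a greedy argmax.
import numpy as np

rng = np.random.default_rng(0)

def sample_next_token(logits, temperature=1.0):
    logits = np.asarray(logits, dtype=float)
    if temperature == 0.0:
        return int(np.argmax(logits))       # deterministic greedy pick
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())   # stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

print(sample_next_token([2.0, 1.0, 0.1], temperature=0.0))  # always 0
print(sample_next_token([2.0, 1.0, 0.1], temperature=1.5))  # varies
```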

andai•3mo ago
>just set temp to 0 to make LLMs deterministic

Does that really work? And is it affected by the almost continuous silent model updates? And GPT-5 has a "hidden" system prompt, even through the API, which seems to have undergone several changes since launch...

simonw•3mo ago
It famously does not: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...
Scipio_Afri•3mo ago
Hey Simon, do you have any posts diving into how one might deal with evaluating LLMs, or machine learning models in general, when reproducibility is so difficult given non-determinism? PyTorch has an article on it https://docs.pytorch.org/docs/stable/notes/randomness.html but doesn't really go into how one would take this deterministic result and evaluate a model that is in production (which, for performance reasons, would very likely need the non-determinism features enabled).

While this seems to affect all models, I think the case gets worse for LLMs in particular, because I would imagine all backends, including proprietary ones, are batching users' prompts. Other concurrent requests seem to change the output of your request, and if there is even a one-token change to the input or output, especially on large inputs or outputs, the divergence can compound. vLLM's documentation mentions this as well: https://docs.vllm.ai/en/latest/usage/faq.html

So how does one benchmark AI/ML models and LLMs reliably (let's set aside the flaws of the metrics themselves and focus on the fact that the output for any particular input can diverge, given the above)? You'd also want to redo evals as soon as any hardware or software stack changes are made to the production environment.

It seems like one needs to set up a highly deterministic backend for an initial eval, forcing deterministic behavior in PyTorch and using a backend which doesn't do batching, so that troubleshooting is possible and outputs don't vary; that gives a sense of how consistent the model is without the noise of batching and non-deterministic GPU calculations/kernels, etc.

Then, for production, when determinism isn't guaranteed because you need batching and non-determinism for performance, I would think one would want to do multiple runs in various real-world situations (such as multiple users making all sorts of different queries at the same time) and do some sort of averaging of the results. But I'm not entirely sure, because I would imagine the types of queries other users are making would change the results fairly significantly. I'm not sure how much the batching that vLLM does would change the outputs, but vLLM does say that batching influences them.
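A crude sketch of that multiple-runs-and-averaging idea, assuming a hypothetical model_call and per-case pass/fail checks:

```python
# Hypothetical sketch: repeat each eval case n_runs times against a
# non-deterministic endpoint and report the mean pass rate per case.
from statistics import mean

def noisy_eval(model_call, cases, n_runs=5):
    per_case = {}
    for prompt, passes_check in cases:
        results = [passes_check(model_call(prompt)) for _ in range(n_runs)]
        per_case[prompt] = mean(results)  # bools average as 0/1
    overall = mean(per_case.values())
    return overall, per_case
```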

simonw•3mo ago
This is so hard! I don't yet have a great solution for this myself, but I've been collecting notes about this on my "evals" tag for a while: https://simonwillison.net/tags/evals/

The best writing I've seen about this is from Hamel Husain - https://hamel.dev/blog/posts/llm-judge/ and https://hamel.dev/blog/posts/evals-faq/ are both excellent.

andai•3mo ago
>But why aren’t LLM inference engines deterministic? One common hypothesis is that some combination of floating-point non-associativity and concurrent execution leads to nondeterminism based on which concurrent core finishes first. We will call this the “concurrency + floating point” hypothesis for LLM inference nondeterminism.

Dang, so we don't even know why it's not deterministic, or how to make it so? That's quite surprising! So if I'm reading this right, it doesn't just have to do with LLM providers cutting costs or making changes or whatever. You can't even get determinism locally. That's wild.
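The floating-point half of that hypothesis is easy to demonstrate in isolation: addition order changes the result.

```python
# Floating-point addition is not associative, so the order in which
# concurrent GPU threads finish a reduction can change the result.
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0 (the 1.0 is absorbed into -1e16 first)
```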

But I did read something just the other day about LLMs being invertible. It goes over my head but it sounds like they got a pretty reliable mapping from inputs to outputs, at least?

https://news.ycombinator.com/item?id=45758093

> Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the input from a model's representations. In this paper, we challenge this view. First, we prove mathematically that transformer language models mapping discrete input sequences to their corresponding sequence of continuous representations are injective and therefore lossless, a property established at initialization and preserved during training. Second, we confirm this result empirically through billions of collision tests on six state-of-the-art language models, and observe no collisions.

The distinction here appears to be between the output tokens versus some sort of internal state?

aeve890•3mo ago
Strictly speaking, it should work. We don't have a _real_ RNG yet, and with the same seed any random function becomes deterministic. But behind the black box of LLM providers, who knows what's tuned in processing your request.

But my point stands. The non-deterministic nature of LLMs is an implementation detail, not even close to the physical constraints the parent comment suggests.

Scipio_Afri•3mo ago
There is inherent non-determinism in all machine learning models unless you explicitly configure PyTorch or other frameworks to be deterministic (https://docs.pytorch.org/docs/stable/notes/randomness.html). However, this is very unlikely to be done for models running in production, due to performance and other issues.
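For reference, the knobs from the linked PyTorch randomness notes look like this (exact flags vary by version and hardware):

```python
# Forcing determinism in PyTorch, per the randomness notes linked
# above (flag availability varies by version and hardware).
import os
import torch

os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # deterministic cuBLAS
torch.manual_seed(0)                       # fix the RNG seed
torch.use_deterministic_algorithms(True)   # error on nondeterministic ops
torch.backends.cudnn.benchmark = False     # no autotuned (variable) kernels
```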
dingnuts•3mo ago
The tools mechanical and civil engineers use are predictable. You're confusing the things these engineers design, which have tolerances and things like that, with the tools themselves.

If an engineer built an internal combustion engine that misfired 60% of the time, it simply wouldn't work.

If an engineer measured things with a ruler that only measured correctly 40% of the time, that would be the apt analogy.

The tool isn't what makes engineering a practice, it's the rigor and the ability to measure and then use the measurements to predict outcomes to make things useful.

Can you predict the outcome from an LLM with an "engineered" prompt?

No, and you aren't qualified to even comment on it since your only claim to fame is a fucking web app

skylurk•3mo ago
Civil engineers deal with contractors who would misfire 100% of the time if they could get away with it.
simonw•3mo ago
> No, and you aren't qualified to even comment on it since your only claim to fame is a fucking web app

Whoa, where did that come from?

skylurk•3mo ago
I know, right? I did not predict that output either.
quequon•3mo ago
Then you just engineering'd.
graemefawcett•3mo ago
If you're claiming those to be the success ratios you're having with AI-assisted engineering, perhaps the phrase "context in, tokens out" might help. The relationship is symmetrical, I have found.

In general, the more constraints you apply on the solution space via context, the more likely the correct solution is to stabilize.

It also helps to engineer the solution in such a way that the correct solution is also the easiest, and thus the most likely.

It takes time, but like most skills can be learned.

quequon•3mo ago
Classic shilling behavior of the insufferably embarrassing: redefining words to the benefit of those who pay your bills and to the confusion of everyone else.

The definition of engineering, according to people outside the pocket of the llm industry:

> The application of scientific and mathematical principles to practical ends such as the design, manufacture, and operation of efficient and economical structures, machines, processes, and systems.

How do these techniques apply scientific and mathematical principles?

I would argue that doing either of those requires reproducibility, and yet somehow you are arguing that the less reproducible something is, the more like "physical engineering" it becomes.

simonw•3mo ago
Being accused of shilling for saying that context engineering is closer to traditional engineering than software engineering is a new one for me.
j45•3mo ago
Engineering how to engineer things might be engineering in some ways.
Zababa•3mo ago
There was a good series interviewing people that worked in both software engineering and traditional engineering: https://www.hillelwayne.com/post/are-we-really-engineers/. The conclusion was that yes, a lot of what we do as software engineers is engineering.
sgt101•3mo ago
Why would I believe that any of this works? This is just some bloke's idea of what people should do.

There is no evidence offered. No attempt to measure the benefits.

calebkaiser•3mo ago
Most of the inference techniques (what the author calls context engineering design patterns) listed here originally came from the research community, and there are tons of benchmarks measuring their effectiveness, as well as a great deal of research behind what is happening mechanistically with each.

As the author points out, many of the patterns are fundamentally about in-context learning, and this in particular has been subject to a ton of research from the mechanistic interpretability crew. If you're curious, I think this line of research is fascinating: https://transformer-circuits.pub/2022/in-context-learning-an...

sgt101•3mo ago
so why does the author not link to or reference this material so that other people can evaluate it?
dwaltrip•3mo ago
Imagine the gall of someone who just goes on the internet and writes something.
sgt101•3mo ago
Yeah - random baseless assertions are at the heart of progress.
alecco•3mo ago
This looks like AI-generated slop.
Balgair•3mo ago
I know this is a bit of a non sequitur but, on my feed just below your comment, someone asked for the RSS feed for this blog. The juxtaposition of the two comments here is just soooo HN.
8474_s•3mo ago
The only thing that has passed the test of time, so far, is specificity: if you ask for multiple things or vague things, you receive half-baked answers trying to cover all bases. If you ask for one specific thing and describe it, the answer quality goes up. For example, LLMs creating multi-part content mix up the parts and their qualities, so asking for Part 1 specifically will always get a better answer than "list all parts of X" (quality drops with the length of the list).
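As a sketch of that tip in practice: one focused request per part instead of a single catch-all prompt (llm here is a hypothetical model call):

```python
# Hypothetical sketch of the specificity tip: one focused request
# per part instead of a single "list all parts of X" prompt.
def describe_parts(llm, topic, parts):
    return {
        part: llm(f"Describe {part} of {topic}. Cover only {part}.")
        for part in parts
    }
```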