LLMs work best when the user defines their acceptance criteria first

https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code

106•dnw•3h ago

Comments

marginalia_nu•2h ago

I tried to make Claude Code, Sonnet 4.6, write a program that draws a fleur-de-lis.

No exaggeration it floundered for an hour before it started to look right.

It's really not good at tasks it has not seen before.

tartoran•2h ago

Have you tried describing to Claude what it is? The more the detail the better the result. At some point it does become easier to just do it yourself.

vdfs•2h ago

Most people just forget to tell it "make it quick" and "make no mistake"

mekael•2h ago

I’m unable to determine if you’re missing /s or not.

tartoran•2h ago

That's kind of foolish IMO. How can an open ended generic and terse request satisfy something users have in mind?

marginalia_nu•2h ago

It knows what it is, it's a very well known symbol. But translating that knowledge to code is something else.

Interesting shortcoming, really shows how weak the reasoning is.

cat_plus_plus•2h ago

Try writing code from description without looking at the picture or generated graphics. Visual LLM with a suggestion to find coordinates of different features and use lines/curves to match them might do better.

comex•2h ago

LLMs are really bad at anything visual, as demonstrated by pelicans riding bicycles, or Claude Plays Pokémon.

Opus would probably do better though.

tartoran•2h ago

How could they be any good at visuals? They are trained on text after all.

msephton•2h ago

Shapes can be described as text or mathematical formulas.

comex•2h ago

Supposedly the frontier LLMs are multimodal and trained on images as well, though I don't know how much that helps for tasks that don't use the native image input/output support.

Whatever the cause, LLMs have gotten significantly better over time at generating SVGs of pelicans riding bicycles:

https://simonwillison.net/tags/pelican-riding-a-bicycle/

But they're still not very good.

tartoran•1h ago

I have to admit I'm seeing this for the first time and am somewhat impressed by the results and even think they will get better with more training, why not... But are these multimodal LLMs still LLMs though? I mean, they're still LLMs but with a sidecar that does other things and the training of the image takes place outside the LLMs so in a way the LLMs still don't "know" anything about these images, they're just generating them on the fly upon request.

boxedemp•37m ago

Maybe we should drop one of the L's

tempest_•2h ago

An SVG is just text.

astrange•2h ago

Claude is multimodal and can see images, though it's not good at thinking in them.

jshmrsn•2h ago

Considering that a fleur-de-lis involves somewhat intricate curves, I think I'd be pretty happy with myself if I could get that task done in an hour.

Given a harness that allows the model to validate the result of its program visually, and given the models are capable of using this harness to self correct (which isn't yet consistently true), then you're in a situation where in that hour you are free to do some other work.

A dishwasher might take 3 hours to do for what a human could do in 30 minutes, but they're still very useful because the machine's labor is cheaper than human labor.

marginalia_nu•2h ago

I didn't provide any constraints on how to draw it.

TBH I would have just rendered a font glyph, or failing that, grabbed an image.

Drawing it with vector graphics programmatically is very hard, but a decent programmer would and should push back on that.

zeroxfe•2h ago

> TBH I would have just rendered a font glyph, or failing that, grabbed an image.

If an LLM did that, people would be all up in arms about it cheating. :-)

For all its flaws, we seem to hold LLMs up to an unreasonably high bar.

marginalia_nu•1h ago

That's the job description for a good programmer though. Question assumptions and requirements, and then find the simplest solution that does the job.

Just about anyone can eventually come up with a hideously convoluted HeraldicImageryEngineImplFactory<FleurDeLis>.

ehnto•2h ago

Even with well understood languages, if there isn't much in the public domain for the framework you're using it's not really that helpful. You know you're at the edges of its knowledge when you can see the exact forum posts you are looking at showing up verbatim in it's responses.

I think some industries with mostly proprietary code will be a bit disappointing to use AI within.

internet2000•1h ago

I got Opus 4.6 to one shot it, took 5-ish mins. "Write me a python program that outputs an svg of a fleur-de-lis. Use freely available images to double check your work."

It basically just re-created the wikipedia article fleur-de-lis, which I'm not sure proves anything beyond "you have to know how to use LLMs"

robertcope•33m ago

Same, I used Sonnet 4.6 with the prompt, "Write a simple program that displays a fleur-de-lis. Python is a good language for this." Took five or six minutes, but it wrong a nice Python TK app that did exactly what it was supposed to.

flerchin•2h ago

Yes plausible text prediction is exactly what it is. However, I wonder if the author included benchmarking in their prompt. It's not exactly fair to keep hidden requirements.

g947o•2h ago

Attributing these to "hidden requirements" is a slippery slope.

My own experience using Claude Code and similar tools tells me that "hidden requirements" could include:

* Make sure DESIGN.md is up to date

* Write/update tests after changing source, and make sure they pass

* Add integration test, not only unit tests that mock everything

* Don't refactor code that is unrelated to the current task

...

These are not even project/language specific instructions. They are usually considered common sense/good practice in software engineering, yet I sometimes had to almost beg coding agents to follow them. (You want to know how many times I have to emphasize don't use "any" in a TypeScript codebase?)

People should just admit it's a limitation of these coding tools, and we can still have a meaningful discussion.

flerchin•2h ago

Yeah I agree generally that the most banal things must be specified, but I do think that a single sentence in the prompt "Performance should be equivalent" would likely have yielded better results.

lukeify•2h ago

Most humans also write plausible code.

tartoran•2h ago

LLMs piggyback on human knowledge encoded in all the texts they were trained on without understanding what they're doing.

Humans would execute that code and validate it. From plausible it'd becomes hey, it does this and this is what I want. LLMs skip that part, they really have no understanding other than the statistical patterns they infer from their training and they really don't need any for what they are.

owlninja•2h ago

They probably at least look at the docs?

stevenhuang•54m ago

LLMs can execute code and validate it too so the assertions you've made in your argument are incorrect.

What a shame your human reasoning and "true understanding" led you astray here.

FrankWilhoit•2h ago

Enterprise customers don't buy correct code, they buy plausible code.

kibwen•2h ago

Enterprise customers don't buy plausible code, they buy the promise of plausible code as sold by the hucksters in the sales department.

marginalia_nu•2h ago

I think SolarWinds would have preferred correct code back in 2020.

qup•2h ago

Okay, but what did they buy?

marginalia_nu•2h ago

Code, from their employees.

2god3•1h ago

They're not buying code.

They are buying a service. As long as the service 'works' they do not care about the other stuff. But they will hold you liable when things go wrong.

The only caveat is highly regulated stuff, where they actually care very much.

cat_plus_plus•2h ago

That's very impressive. Your LLM actually wrote a correct code for a full relational database on the first try, like it takes 2.5 seconds to insert 100 rows but it stores them correctly and select is pretty fast. How many humans can do this without a week of debugging? I would suggest you install some profiling tools and ask it to find and address hotspots. SQL Lite had how long and how many people to get to where it is?

bluefirebrand•2h ago

I could "write" this code the same way, it's easy

Just copy and paste from an open source relational db repo

Easy. And more accurate!

snoob2021•2h ago

It is a Rust reimplementation of SQLite. Not exactly just "copy and paste"

cat_plus_plus•2h ago

The actual task is usually to mix something that looks like a dozen of different open source repos combined but to take just the necessary parts for task at hand and add glue / custom code for the exact thing being built. While I could do it, LLM is much faster at it, and most importantly I would not enjoy the task.

comex•2h ago

Based on a search, the SQLite reimplementation in question is Frankensqlite, featured on Hacker News a few days ago (but flagged):

https://news.ycombinator.com/item?id=47176209

mmaunder•2h ago

But my AI didn't do what your AI did.

Cherry picked AI fail for upvotes. Which you’ll get plenty of here an on Reddit from those too lazy to go and take a look for themselves.

Using Codex or Claude to write and optimize high performance code is a game changer. Try optimizing cuda using nsys, for example. It’ll blow your lazy little brain.

oofbey•2h ago

It’s easy to get AI to write bad code. Turns out you still need coding skills to get AI to write good code. But those who have figured it out can crank out working systems at a shocking pace.

serious_angel•2h ago

I am sorry for asking, but... is there guide even on how to "figure it out"? Otherwise, how are you so sure about it?

mmaunder•1h ago

That's actually a great question. Truth be told the best way right now is to grab Codex CLI or Claude CLI (I strongly prefer Codex, but Claude has its fans), and just start. Immediately. Then go hard for a few months and you'll develop the skills you need.

A few tips for a quickstart:

Give yourself permission to play.

Understand basic concepts like context window, compaction, tokens, chain of thought and reasoning, and so on. Use AI to teach you this stuff, and read every blog post OpenAI and Anthropic put out and research what you don't understand.

Pick a hard coding problem in Python or Typescript and take a leap of faith and ask the agent to code it for you.

My favorite phrase when planning is: "Don't change anything. Just tell me.". Save this as a tmux shortcut and use it at the end of every prompt when planning something out.

Use markdown .md docs to create a planning doc and keep chatting to the agent about it and have it update the plan until you're super happy, always using the magic phrase "Don't change anything. Just tell me." (I should get myself a patent on that little number. Best trick I know)

Every time you see an anti-AI post, just move on. It's lazy people making lazy assumptions. Approach agentic coding with a sense of love, excitement, optimism, and take massive leaps of faith and you'll be very very surprised at what you find.

Best of luck Serious Angel.

2god3•1h ago

You're not really answering the question are you?

Your answer is to play with it. Cool. But why cant you and others put together a proper guide lol? It cant be that hard.

Go ahead and do it - it'll challenge the Anti-AI posters you are referencing. I and others want to see that debate.

mmaunder•1h ago

Ah - I know! Seriously I know. There's such a bad need for this right now. The problem is that the folks who are great at agentic coding are coding their asses off 16 to 20 hours a day and don't have a minute they want to spend on writing guides because of the opportunity cost.

One of the rare resources I found recently was the OpenClaw guys interview on Lex. He drops a few bangers that are really valuable and will save you having to spend a long time figuring it out.

Also there's a very strong disincentive for anyone to write right now because we're competing against the noise and the slop in the space. So best to just shut the fuck up and create as fast as we can, and let the outcome speak for itself. You're going to see a lot more products like OpenClaw where the pace of innovation is rapid, and the author freely admits that they're coding agentically and not writing a single line.

I think the advantage that Peter has (openclaw author) is that he has enough money and success to not give a fuck about what people say re him writing purely agentically, so he's been very open about it which has been great for others who are considering doing the same.

But if you have a software engineering career or are a public figure with something to lose, you tend to STFU if you're doing pure agentic coding on a project.

But that'll change. Probably over the next few months. OpenClaw broke the ice.

appcustodian2•1h ago

Don't worry we'll all be taking the Claude certification courses soon enough

pornel•1h ago

When a new technology emerges we typically see some people who embrace it and "figure it out".

Electronic synthesisers went from "it's a piano, but expensive and sounds worse" to every weird preset creating a whole new genre of electronic music.

So it seems plausible, like Claude's code, that our complaints about unmaintainable code are from trying to use it like a piano, and the rave kids will find a better use for it.

appcustodian2•1h ago

How do you figure anything out? You go use it, a lot.

wmeredith•55m ago

Right here: https://codemanship.wordpress.com/2025/10/30/the-ai-ready-so...

This series of articles is gold.

Unsurprisingly, writing good software with AI follows the same principles as writing it without AI. Keep scopes small. Ship, refactor, optimize, and write tests as you go.

oofbey•18m ago

I am sure because I can crank out working systems at a shocking pace. As can some of my friends and coworkers.

mmaunder•1h ago

Agreed 100%. I'd add that it's the knowledge of architecture and scaling that you got from writing all that good code, shipping it, and then having to scale it. It gives you the vocabulary and broad and deep knowledge base to innovate at lightning speeds and shocking levels of complexity.

kccqzy•1h ago

Yeah right. A LLM in the hands of a junior engineer produces a lot of code that looks like they are written by juniors. A LLM in the hands of a senior engineer produces code that looks like they are written by seniors. The difference is the quality of the prompt, as well as the human judgement to reject the LLM code and follow-up prompts to tell the LLM what to write instead.

2god3•1h ago

Lol what. The difference is that the senior... is a senior. Ask yourself what characteristics comprises a senior vs junior...

You're glossing over so much stuff. Moreover, how does the Junior grow and become the senior with those characteristics, if their starting point is LLMs?

mmaunder•1h ago

I kind of agree. But I'd adjust that to say that in both cases you get good looking code. In the hands of a junior you get crappy architecture decisions and complete failure to manage complexity which results in the inevitable reddit "they degraded the model" post. In the hands of seniors you get well managed complexity, targeted features, scalable high performance architecture, and good base technology choices.

gzread•2h ago

Early LLMs would do better at a task if you prefixed the task with "You are an expert [task doer]"

serious_angel•2h ago

Holy gracious sakes... Of course... Thank you... thank you... dear katanaquant, from the depths... of my heart... There's still belief in accountability... in fun... in value... in effort... in purpose... in human... in art...

- <http://archive.today/2026.03.07-020941/https://lr0.org/blog/...> (I'm not consulting an LLM...)

- <https://web.archive.org/web/20241021113145/https://slopwatch...>

pornel•2h ago

Their default solution is to keep digging. It has a compounding effect of generating more and more code.

If they implement something with a not-so-great approach, they'll keep adding workarounds or redundant code every time they run into limitations later.

If you tell them the code is slow, they'll try to add optimized fast paths (more code), specialized routines (more code), custom data structures (even more code). And then add fractally more code to patch up all the problems that code has created.

If you complain it's buggy, you can have 10 bespoke tests for every bug. Plus a new mocking framework created every time the last one turns out to be unfit for purpose.

If you ask to unify the duplication, it'll say "No problem, here's a brand new metamock abstract adapter framework that has a superset of all feature sets, plus two new metamock drivers for the older and the newer code! Let me know if you want me to write tests for the new adapters."

stingraycharles•1h ago

> If you ask to unify the duplication, it'll say "No problem, here's a brand new metamock abstract adapter framework that has a superset of all feature sets, plus two new metamock drivers for the older and the newer code! Let me know if you want me to write tests for the new adapters."

Nevermind the fact that it only migrated 3 out of 5 duplicated sections, and hasn’t deleted any now-dead code.

vannevar•1h ago

I'd highly recommend working top down, getting it to outline a sane architecture before it starts coding. Then if one of the modules starts getting fouled up, start with a clean sheet context (for that module) incorporating any cautions or lessons learned from the bad experience. LLMs are not yet good at working and reworking the same code, for the reasons you outline. But they are pretty good at a "Groundhog Day" approach of going through the implementation process over and over until they get it right.

bryanrasmussen•1h ago

maybe there should be an LLM trained on a corpus of a deletions and cleanup of code.

krackers•8m ago

I'm guessing there's a very strong prior to "just keep generating more tokens" as opposed to deleting code that needs to be overcome. Maybe this is done already but since every git project comes with its own history, you could take a notable open-source project (like LLVM) and then do RL training against against each individual patch committed.

unlikelytomato•1h ago

This is why I'm confused when people say it isn't ready to replace most of the programmer workforce.

esafak•1h ago

I have run into this too. Some of it is because models lack the big picture; so called agentic search (aka grep) is myopic.

marginalia_nu•1h ago

My sense is that the code generation is fast, but then you always need to spend several hours making sure the implementation is appropriate, correct, well tested, based on correct assumptions, and doesn't introduce technical debt.

You need to do this when coding manually as well, but the speed at which AI tools can output bad code means it's so much more important.

LPisGood•31m ago

And it’s slower to review because you didn’t do the hard part of understanding the code as it was being written.

ehnto•18m ago

Well when you write it manually you are doing the review and sanity checking in real time. For some tasks, not all but definitely difficult tasks, the sanity checking is actually the whole task. The code was never the hard part, so I am much more interested in the evolving of AIs real world problem solving skills over code problems.

I think programming is giving people a false impression on how intelligent the models are, programmers are meant to be smart right so being able to code means the AI must be super smart. But programmers also put a huge amount of their output online for free, unlike most disciplines, and it's all text based. When it comes to problem solving I still see them regularly confused by simple stuff, having to reset context to try and straighten it out. It's not a general purpose human replacement just yet.

MattGaiser•25m ago

> If they implement something with a not-so-great approach, they'll keep adding workarounds or redundant code every time they run into limitations later.

Are you using plan mode? I used to experience the do a poor approach and dig issue, but with planning that seems to have gone away?

Implicated•11m ago

Not trying to be snarky, with all due respect... this is a skill issue.

It's a tool. It's a wildly effective and capable tool. I don't know how or why I have such a wildly different experience than so many that describe their experiences in a similar manner... but... nearly every time I come to the same conclusion that the input determines the output.

> If they implement something with a not-so-great approach, they'll keep adding workarounds or redundant code every time they run into limitations later.

Yes, when the prompt/instructions are overly broad and there's no set of guardrails or guidelines that indicate how things should be done... this will happen. If you're not using planning mode, skill issue. You have to get all this stuff wrapped up and sorted before the implementation begins. If the implementation ends up being done in a "not-so-great" approach - that's on you.

> If you tell them the code is slow

Whew. Ok. You don't tell it the code is slow. Do you tell your coworker "Hey, your code is slow" and expect great results? You ask it to benchmark the code and then you ask it how it might be optimized. Then you discuss those options with it (this is where you do the part from the previous paragraph, where you direct the approach so it doesn't do "no-so-great approach") until you get to a point where you like the approach and the model has shown it understands what's going on.

Then you accept the plan and let the model start work. At this point you should have essentially directed the approach and ensured that it's not doing anything stupid. It will then just execute, it'll stay within the parameters/bounds of the plan you established (unless you take it off the rails with a bunch of open ended feedback like telling it that it's buggy instead of being specific about bugs and how you expect them to be resolved).

> you can have 10 bespoke tests for every bug. Plus a new mocking framework created every time the last one turns out to be unfit for purpose.

This is an area I will agree that the models are wildly inept. Someone needs to study what it is about tests and testing environments and mocking things that just makes these things go off the rails. The solution to this is the same as the solution to the issue of it keeping digging or chasing it's tail in circles... Early in the prompt/conversation/message that sets the approach/intent/task you state your expectations for the final result. Define the output early, then describe/provide context/etc. The earlier in the prompt/conversation the "requirements" are set the more sticky they'll be.

And this is exactly the same for the tests. Either write your own tests and have the models build the feature from the test or have the model build the tests first as part of the planned output and then fill in the functionality from the pre-defined test. Be very specific about how your testing system/environment is setup and any time you run into an issue testing related have the model make a note about that and the solution in a TESTING.md document. In your AGENTS.md or CLAUDE.md or whatever indicate that if the model is working with tests it should refer to the TESTING.md document for notes about the testing setup.

Personally, I focus on the functionality, get things integrated and working to the point I'm ready to push it to a staging or production (yolo) environment and _then_ have the model analyze that working system/solution/feature/whatever and write tests. Generally my notes on the testing environment to the model are something along the lines of a paragraph describing the basic testing flow/process/framework in use and how I'd like things to work.

The more you stick to convention the better off you'll be. And use planning mode.

skybrian•2h ago

You can ask an LLM to write benchmarks and to make the code faster. It will find and fix simple performance issues - the low-hanging fruit. If you want it to do better, you can give it better tools and more guidance.

It's probably a good idea to improve your test suite first, to preserve correctness.

jqpabc123•1h ago

LLMs have no idea what "correct" means.

Anything they happen to get "correct" is the result of probability applied to their large training database.

Being wrong will always be not only possible but also likely any time you ask for something that is not well represented in it's training data. The user has no way to know if this is the case so they are basically flying blind and hoping for the best.

Relying on an LLM for anything "serious" is a liability issue waiting to happen.

tonypapousek•1h ago

It’s a shame of bulk of that training data is likely 2010s blogspam that was poor quality to begin with.

2god3•1h ago

But isn't that a reflection of reality?

If you've made a significant investment in human capital, you're even more likely to protect it now and prevent posting valuable stuff on the web.

2god3•1h ago

Aye. I wish more conversations would be more of this nature - in that we should start with basic propositions - e.g. the thing does not 'know' or 'understand' what correct is.

LarsDu88•1h ago

This is about to change very soon. Unlike many other domains (such as greenfield scientific discovery), most coding problems for which we can write tests and benchmarks are "verifiable domains".

This means an LLM can autogenerated millions of code problem prompts, attempt millions of solutions (both working and non-working), and from the working solutions, penalize answers that have poor performance. The resulting synthetic dataset can then be used as a finetuning dataset.

There are now reinforcement finetuning techniques that have not been incorporated into the existing slate of LLMs that will enable finetuning them for both plausibility AND performance with a lot of gray area (like readability, conciseness, etc) in between.

What we are observing now is just the tip of a very large iceberg.

2god3•1h ago

Lets suppose whatever you say is true.

If Im the govt, Id be foaming at the mouth - those projects that used to require enormous funding now will supposedly require much less.

Hmmm, what to do? Oh I know. Lets invest in Digital ID-like projects. Fun.

ontouchstart•1h ago

I made a comment in another thread about my acceptance criteria

https://news.ycombinator.com/item?id=47280645

It is more about LLMs helping me understand the problem than giving me over engineered cookie cutter solutions.

graphememes•1h ago

bad input > bad output

idk what to say, just because it's rust doesn't mean it's performant, or that you asked for it to be performant.

yes, llms can produce bad code, they can also produce good code, just like people

codethief•1h ago

> Your LLM Doesn't Write Correct Code. It Writes Plausible Code.

I don't always write correct code, either. My code sure as hell is plausible but it might still contain subtle bugs every now and then.

In other words: 100% correctness was never the bar LLMs need to pass. They just need to come close enough.

raw_anon_1111•1h ago

The difference for me recently

Write a lambda that takes an S3 PUT event and inserts the rows of a comma separated file into a Postgres database.

Naive implementation: download the file from s3 and do a bulk insert - it would have taken 20 minutes and what Claude did at first.

I had to tell it to use the AWS sql extension to Postgres that will load a file directly from S3 into a table. It took 20 seconds.

I treat coding agents like junior developers.

svpyk•1h ago

Unlike junior developers, llms can take detailed instructions and produce outstanding results at first shot a good number of times.

conception•26m ago

Did you ask it to research best practices for this method, have an adversarial performance based agent review their approach or search for performant examples of the task first? Relying on training data only will always get your subpar results. Using “What is the most performant way to load a CSV from S3 into PostgreSQL on RDS? Compare all viable and research approaches before recommending one.” gave me the extension as the top option.

D-Machine•6m ago

This article is great. And the blog-article headline is interesting, but wrong. LLM's don't in general write plausible code (as a rule) either.

They just write code that is (semantically) similar to code (clusters) seen in its training data, and which haven't been fenced off by RLHF / RLVR.

This isn't that hard to remember, and is a correct enough simplification of what generative LLMs actually do, without resorting to simplistic or incorrect metaphors.

Plasma Bigscreen – 10-foot interface for KDE plasma

UUID package coming to Go standard library

this css proves me human

Can a wealthy family change the course of a deadly brain disease?

LLMs work best when the user defines their acceptance criteria first

Maybe There's a Pattern Here?

C# strings silently kill your SQL Server indexes in Dapper

Galileo's handwritten notes found in ancient astronomy text

Hardening Firefox with Anthropic's Red Team

Tell HN: I'm 60 years old. Claude Code has ignited a passion again

Querying 3B Vectors

Show HN: Moongate – Ultima Online server emulator in .NET 10 with Lua scripting

The Shady World of IP Leasing

Tech employment now significantly worse than the 2008 or 2020 recessions

AI and the Illegal War

Launch HN: Palus Finance (YC W26): Better yields on idle cash for startups, SMBs

Show HN: 1v1 coding game that LLMs struggle with

Show HN: Kula – Lightweight, self-contained Linux server monitoring tool

CT Scans of Health Wearables

What canceled my Go context?

Entomologists use a particle accelerator to image ants at scale

A Modular Robot Dashboard

Ada 2022

A tool that removes censorship from open-weight LLMs

Workers who love ‘synergizing paradigms’ might be bad at their jobs

Good Bad ISPs

Game about Data of America

Analytic Fog Rendering with Volumetric Primitives (2025)

Multifactor (YC F25) Is Hiring an Engineering Lead

Astra: An open-source observatory control software

LLMs work best when the user defines their acceptance criteria first

Comments

Plasma Bigscreen – 10-foot interface for KDE plasma

UUID package coming to Go standard library

this css proves me human

Can a wealthy family change the course of a deadly brain disease?

LLMs work best when the user defines their acceptance criteria first

Maybe There's a Pattern Here?

C# strings silently kill your SQL Server indexes in Dapper

Galileo's handwritten notes found in ancient astronomy text

Hardening Firefox with Anthropic's Red Team

Tell HN: I'm 60 years old. Claude Code has ignited a passion again

Querying 3B Vectors

Show HN: Moongate – Ultima Online server emulator in .NET 10 with Lua scripting

The Shady World of IP Leasing

Tech employment now significantly worse than the 2008 or 2020 recessions

AI and the Illegal War

Launch HN: Palus Finance (YC W26): Better yields on idle cash for startups, SMBs

Show HN: 1v1 coding game that LLMs struggle with

Show HN: Kula – Lightweight, self-contained Linux server monitoring tool

CT Scans of Health Wearables

What canceled my Go context?

Entomologists use a particle accelerator to image ants at scale

A Modular Robot Dashboard

Ada 2022

A tool that removes censorship from open-weight LLMs

Workers who love ‘synergizing paradigms’ might be bad at their jobs

Good Bad ISPs

Game about Data of America

Analytic Fog Rendering with Volumetric Primitives (2025)

Multifactor (YC F25) Is Hiring an Engineering Lead

Astra: An open-source observatory control software