Show HN: Argus – VSCode debugger for Claude Code sessions

https://github.com/yessGlory17/argus
26•lydionfinance•1h ago•8 comments

The Millisecond That Could Change Cancer Treatment

https://spectrum.ieee.org/flash-radiotherapy
28•marc__1•1h ago•8 comments

Ki Editor - an editor that operates on the AST

https://ki-editor.org/
237•ravenical•6h ago•71 comments

Show HN: ANSI-Saver – A macOS Screensaver

https://github.com/lardissone/ansi-saver
39•lardissone•2h ago•10 comments

SigNoz (YC W21, open source Datadog) Is Hiring across roles

https://signoz.io/careers
1•pranay01•4m ago

Compiling Prolog to Forth [pdf]

https://vfxforth.com/flag/jfar/vol4/no4/article4.pdf
15•PaulHoule•3d ago•0 comments

Plasma Bigscreen – 10-foot interface for KDE plasma

https://plasma-bigscreen.org
561•PaulHoule•17h ago•175 comments

The yoghurt delivery women combatting loneliness in Japan

https://www.bbc.com/travel/article/20260302-the-yoghurt-delivery-women-combatting-loneliness-in-j...
72•ranit•3h ago•49 comments

PC processors entered the Gigahertz era today in the year 2000 with AMD's Athlon

https://www.tomshardware.com/pc-components/cpus/pc-processors-entered-the-gigahertz-era-today-in-...
81•LorenDB•2h ago•47 comments

UUID package coming to Go standard library

https://github.com/golang/go/issues/62026
290•soypat•15h ago•185 comments

Filesystems Are Having a Moment

https://madalitso.me/notes/why-everyone-is-talking-about-filesystems/
57•malgamves•6h ago•18 comments

Self-Portrait by Ernst Mach (1886)

https://publicdomainreview.org/collection/self-portrait-by-ernst-mach-1886/
36•Hooke•1d ago•8 comments

Re-creating the complex cuisine of prehistoric Europeans

https://arstechnica.com/science/2026/03/recreating-the-complex-cuisine-of-prehistoric-europeans/
6•apollinaire•20h ago•0 comments

this css proves me human

https://will-keleher.com/posts/this-css-makes-me-human/
321•todsacerdoti•19h ago•100 comments

48x32, a 1536 LED Game Computer (2023)

https://jacquesmattheij.com/48x32-introduction/
46•duck•2d ago•10 comments

Tinnitus Is Connected to Sleep

https://www.sciencealert.com/tinnitus-is-somehow-connected-to-a-crucial-bodily-function
60•bookofjoe•2h ago•70 comments

Seurat Most Famous for Paris Park Painting Yet Half His Paintings Were Seascapes

https://www.smithsonianmag.com/smart-news/georges-seurat-is-most-famous-for-his-pointillist-paint...
8•bookofjoe•3d ago•1 comments

Uploading Pirated Books via BitTorrent Qualifies as Fair Use, Meta Argues

https://torrentfreak.com/uploading-pirated-books-via-bittorrent-qualifies-as-fair-use-meta/
252•askl•7h ago•146 comments

Helix: A post-modern text editor

https://helix-editor.com/
253•doener•17h ago•115 comments

Galileo's handwritten notes found in ancient astronomy text

https://www.science.org/content/article/galileo-s-handwritten-notes-found-ancient-astronomy-text
185•tzury•2d ago•34 comments

LLMs work best when the user defines their acceptance criteria first

https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code
350•dnw•15h ago•247 comments

Working and Communicating with Japanese Engineers

https://www.tokyodev.com/articles/working-and-communicating-with-japanese-engineers
92•zdw•4d ago•45 comments

Tell HN: I'm 60 years old. Claude Code has re-ignited a passion

825•shannoncc•17h ago•724 comments

QGIS 4.0

https://changelog.qgis.org/en/version/4.0/
155•jonbaer•8h ago•34 comments

Show HN: µJS, a 5KB alternative to Htmx and Turbo with zero dependencies

https://mujs.org
25•amaury_bouchard•8h ago•7 comments

Lock Scroll with a Vengeance

https://unsung.aresluna.org/lock-scroll-with-a-vengeance/
41•etothet•3d ago•11 comments

Show HN: Moongate – Ultima Online server emulator in .NET 10 with Lua scripting

https://github.com/moongate-community/moongatev2
270•squidleon•1d ago•154 comments

Migrating from Heroku to Magic Containers

https://bunny.net/blog/migrating-from-heroku-to-magic-containers/
20•pimterry•2d ago•7 comments

Compiling Match Statements to Bytecode

https://xnacly.me/posts/2026/compiling-match-statements-to-bytecode/
17•ingve•2d ago•2 comments

The Case of the Disappearing Secretary

https://rowlandmanthorpe.substack.com/p/the-case-of-the-disappearing-secretary
39•rwmj•3h ago•9 comments

LLM Doesn't Write Correct Code. It Writes Plausible Code

https://twitter.com/KatanaLarp/status/2029928471632224486
54•pretext•2h ago

Comments

treetalker•1h ago
This is my experience with how LLMs "draft" legal arguments: at first glance, it's plausible — but may be, and often is, invalid, unsound, and/or ill-advised.

The catch is that many judges lack the time, energy, or willingness to not only read the documents in detail, but also roll up their sleeves and dig into the arguments and cited authorities. (Some lack the skills, but those are extreme cases.) So the plausible argument (improperly and unfortunately) carries the day.

LLM use in litigation drafting is thus akin to insurgent/guerrilla warfare: it takes little time, energy, or thinking to create, yet orders of magnitude more to analyze and refute. (It's a species of Brandolini's Law, the Bullshit Asymmetry Principle.) Thus justice suffers.

I imagine that this is analogous to the cognitive, technical, and "sub-optimal code" debt that LLM-produced code is generating and foisting upon future developers who will have to unravel it.

FpUser•1h ago
> "justice suffers"

Possible. It also suffers when the majority simply cannot afford proper representation.

deaux•1h ago
> This is my experience with how LLMs "draft" legal arguments: at first glance, it's plausible — but may be, and often is, invalid, unsound, and/or ill-advised.

Correct, and this of course extends past just law into the whole scope of rules and regulations described in human language. An LLM will by its nature imply things that aren't explicitly stated and can't be derived with certainty, just because they're very plausible. And those implications can be wrong.

Now, I've had decent success with having LLMs then review these LLM-generated texts to flag occurrences where things aren't directly supported by the source material. But human review is still necessary.

The cases I've been dealing with are also based on relatively small sets of regulations compared to the scope of the law involved in many legal cases. So I imagine that in the domain you're working in, much more needs flagging.

roarcher•37m ago
> LLM use in litigation drafting is thus akin to insurgent/guerrilla warfare: it takes little time, energy, or thinking to create, yet orders of magnitude more to analyze and refute.

The same goes for coding. I have coworkers who use it to generate entire PRs. They can crank out two thousand lines of code that includes tests "proving" that it works, but may or may not actually be nonsense, in minutes. And then some poor bastard like me has to spend half a day reviewing it.

When code is written by a human that I know and trust, I can assume that they at least made reasonable, if not always correct, decisions. I can't assume that with AI, so I have to scrutinize every single line. And when it inevitably turns out that the AI has come up with some ass-backwards architecture, the burden is on me to understand it and explain why it's wrong and how to fix it to the "developer" who hasn't bothered to even read his own PR.

I'm seriously considering proposing that if you use AI to generate a PR at my company, the story points get credited to the reviewer.

basch•16m ago
"Reasoning" needs to go back to the drawing board.

Reasonable tasks need to be converted into formal logic, calculated and computed like a standard evaluation, and then translated back into English or the language of choice.

LLMs are being used to think, when really they should be the interpret and render steps, with something more deterministic in the middle.

Translate -> Reason -> Store to Database. Rinse Repeat. Now the context can call from the database of facts.
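The Translate -> Reason -> Store loop above can be sketched in a few lines. This is a hypothetical toy: a tiny arithmetic parser stands in for the translate step, and a deterministic evaluator does the reasoning (a real system would put an LLM at the translate/render ends and something like a solver in the middle):

```python
import ast
import operator

# Deterministic "reason" step: evaluate a formal expression with no LLM involved.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(expr: str) -> float:
    """Walk the AST of a formal arithmetic claim and compute it exactly."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported construct")
    return walk(ast.parse(expr, mode="eval").body)

facts: dict[str, float] = {}  # the "database of facts" later context can call on

def reason_and_store(name: str, formal_expr: str) -> float:
    result = evaluate(formal_expr)  # Reason: standard, repeatable computation
    facts[name] = result            # Store: persist the fact for retrieval
    return result

# In practice the formal expression would come from an LLM translate step.
reason_and_store("total_cost", "3 * 19.99 + 4.50")
print(round(facts["total_cost"], 2))  # 64.47
```

The point of the sketch is the separation of roles: the middle step is auditable and deterministic, so only the translation at the edges needs a language model.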

baal80spam•1h ago
dupe: https://news.ycombinator.com/item?id=47283337
dang•44m ago
Thanks! I'll merge the threads when I'm not on my phone.
seanmcdirmid•1h ago
Ok, I’ll bite: how is that different from humans?
strken•1h ago
Human behaviour is goal-directed because humans have executive function. When you turn off executive function by going to sleep, your brain will spit out dreams. Dream logic is famous for being plausible but unhinged.

I have the feeling that LLMs are effectively running on dream logic, and everything we've done to make them reason properly is insufficient to bring them up to human level.

whoamii•1h ago
Some of my best code comes from my dreams though.
satvikpendem•1h ago
A prompt for an LLM is also a goal direction and it'll produce code towards that goal. In the end, it's the human directing it, and the AI is a tool whose code needs review, same as it always has been.
basch•11m ago
I'd argue humans have some sort of parallelness going on that machines don't yet: thoughts happening at multiple abstraction levels simultaneously. As I am doing something, I am also running the continuous improvement cycle in my head, at all four steps concurrently: is this working, is this the right direction, does this validate?

You could build layers and layers of LLMs watching the output of each others thoughts and offering different commentary as they go, folding all the thoughts back together at the end. Currently, a group of agents acts more like a discussion than something somewhat omnipotent or omnitemporal.

spiderfarmer•1h ago
And yet LLMs are incredibly useful as they are right now.
nemo44x•1h ago
LLMs are literally goal machines. It’s all they do. So it’s important that you input specific goals for them to work towards. It’s also why logically you want to break the problem into many small problems with concrete goals.
andai•1h ago
Do you only mean instruct-tuned LLMs? Or the base (pretrained) model too?
nemo44x•2m ago
The entire system and the agent loop allows for more complex goal resolution. The LLM models language (obviously) and language is goal oriented so it models goal oriented language. It’s an emergent feature of the system.
tsunamifury•1h ago
It’s amazing how much you get wrong here. As LLM attention layers are stacked goal functions.

What they lack is multi turn long walk goal functions — which is being solved to some degree by agents.

seanmcdirmid•58m ago
Isn’t a modern LLM with thinking tokens fairly goal directed? But yes, we hallucinate in our sleep while LLMs will hallucinate details if the prompt isn’t grounded enough.
tovej•49m ago
Assuming this is not a rhetorical question: no, it is not. The only "goal" is to maximize plausibility.
seanmcdirmid•35m ago
Again, how is that different from humans? I’m not going around trying to prove my code correct when I write it manually.
zarzavat•48m ago
The thing about dream logic is that it can be a completely rational series of steps, but there's usually a giant plot hole which you only realise the second you wake up.

This definitely matches my experience of talking to AI agents and chatbots. They can be extremely knowledgeable on arcane matters yet need to have obvious (to humans) assumptions pointed out to them, since they only have book smarts and not street smarts.

wood_spirit•1h ago
It's not. LLMs are just averaging their internet snapshot, after all.

But people want an AI that is objective and right. HN is where people who know the distinction hang out, but that's not what the layperson thinks they are getting when they use this miraculous, super-hyped tool that everybody is raving about.

satvikpendem•1h ago
By now, a few years after ChatGPT was released, I don't think anyone believes AI is objective and right; all users have seen at least one instance of it hallucinating or simply being wrong.
wood_spirit•1h ago
Sorry, I can think of so many counterexamples. I also detect a lot of "well, it hallucinates about subject X" (which the person knows well, so they can spot the hallucination) from people who continue to trust it on subjects Y and Z (which they know less well, so they can't spot the hallucinations).

YMMV.

andai•1h ago
> Briefly stated, the Gell-Mann Amnesia effect works as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward, reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them. In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read with renewed interest as if the rest of the newspaper was somehow more accurate about far-off Palestine than it was about the story you just read. You turn the page, and forget what you know.

-Michael Crichton

satvikpendem•1h ago
Sure, Gell-Mann amnesia exists, but remember that its origin is actually human, in the form of newspaper writers. So, how can we trust humans the same way? In just the same way, AI cannot also be fully trusted.
wood_spirit•50m ago
The current way of doing AI cannot be trusted.

That doesn't mean the future won't herald a way of using what a transformer is good at (interfacing with humans) to translate to and interact with something that can be a lot more sound and objective.

satvikpendem•36m ago
You're falling into the extrapolation fallacy, there is no reason to think that the future won't have the same issues as today in terms of hallucinations.

And even if they were solved, how would that even work? The world is not sound and objective.

mrwh•59m ago
The etiquette, even at the bigtech place I work, has changed so quickly. The idea that it would be _embarrassing_ to send a code review with obvious or even subtle errors is disappearing. More work is being put on the reviewer. Which might even be fine if we made the further change that _credit goes to the reviewer_. But if anything we're heading in the opposite direction, lines of code pumped out as the criterion of success. It's like a car company that touts how _much_ gas its cars use, not how little.
wood_spirit•54m ago
Review is usually delegated to an AI too
seanmcdirmid•57m ago
There are a lot of binary thinkers on HN, but they shouldn’t make up a majority.
apical_dendrite•1h ago
The volume is different. Someone submitted a PR this week that was 3800 lines of shell script. Most of it was crap and none of it should have been in shell script. He's submitting PRs with thousands of lines of code every day. He has no idea how any of it actually works, and it completely overwhelms my ability to review.

Sure, he could have submitted an ill-considered 3800-line PR five years ago, but it would have taken him at least a week, and there probably would have been opportunities to submit smaller chunks along the way or discuss the approach.

satvikpendem•1h ago
Just block that user, that seems to be the way.
switchbak•46m ago
It’s harder when the person doing what you describe has the ability to have you fired. Power asymmetry + irresponsible AI use + no accountability = a recipe for a code base going right to hell in a few months.

I think we're going to see a lot of the systems we depend on fail a lot more often. You'd often see an ATM or flight status screen with a BSOD; I think we're going to see that kind of thing everywhere soon.

somewhereoutth•1h ago
Humans have a 'world model' beyond the syntax - for code, an idea of what the code should do and how it does it. Of course, some humans are better than others at this, they are recognized as good programmers.
satvikpendem•1h ago
Papers show that AI also has a world model, so I don't think that's the right distinction.
tovej•45m ago
Could you please cite these papers. If by AI you mean LLMs, that is not supported by what I know. If you mean a theoretical world-model-based AI, that's just a tautological statement.
satvikpendem•34m ago
https://arxiv.org/abs/2305.11169

https://arxiv.org/abs/2506.02996

rDr4g0n•1h ago
It's much easier to fire an employee who produces low-quality, low-effort work than to convince leadership to fire Claude.
satvikpendem•53m ago
You can fire employees who don't review their generated code, though, because ultimately it's their responsibility to own their code, whether they hand-wrote it or an LLM did.

It seems to me that it's all a matter of company culture, as it has always been, not AI. Those that tolerate bad code will continue to tolerate it, at their peril.

detourdog•38m ago
What surprises me about the current development environment is the acceleration of technical debt. When I was developing my skills, the nagging feeling that I didn't quite understand the technology was a big dark cloud. I felt this cloud was technical debt, and it was always what I was working against.

I see current expectations that technical debt doesn't matter. The current tools embrace superficial understanding; they paper over the debt. There is no need for deeper understanding of the problem or solution; the tools take care of it behind the scenes.

bitwize•1h ago
You: Claude, do you know how to program?

Claude: No, but if you hum a few bars I can fake it!

Except "faking it" turns out to be good enough, especially if you can fake it at speed and get feedback as to whether it works. You can then just hillclimb your way to an acceptable solution.
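That fake-it, get-feedback, hillclimb loop can be made concrete. In this sketch (all names hypothetical) a random numeric mutation stands in for the LLM's "fake it" step, and a test-score function supplies the feedback:

```python
import random

def run_tests(candidate: float) -> float:
    """Feedback signal: score how close a candidate is to passing (1.0 = perfect)."""
    target = 42.0  # stands in for "all tests green"
    return 1.0 / (1.0 + abs(candidate - target))

def hillclimb(seed: float, steps: int = 200) -> float:
    """Propose plausible variations; keep one only when the feedback improves."""
    best, best_score = seed, run_tests(seed)
    for _ in range(steps):
        candidate = best + random.uniform(-5, 5)  # "fake it": a plausible tweak
        score = run_tests(candidate)
        if score > best_score:                    # accept only on better feedback
            best, best_score = candidate, score
    return best

random.seed(0)
print(round(hillclimb(0.0), 1))  # lands close to 42.0
```

Note what the sketch shows: the proposer never needs to "know how to program"; it only needs a cheap, reliable oracle. The quality of the result is bounded by the quality of the tests, which is exactly the worry upthread about plausible-looking test suites.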

andai•1h ago
Iterative Faking™ — now with plausible-looking test suite!
satvikpendem•1h ago
Oftentimes, plausible code is good enough, hence why people keep using AI to generate code. This is a distinction without a difference.
andai•1h ago
There appears to be a similar approach in UX... plausible user experience is close enough.
satvikpendem•59m ago
Yes, especially because in UX there is no "correct" approach to it, it's all relative.
bluetomcat•59m ago
No. Plausible code is syntactically-correct BS disguised as a solution, hiding a countless amount of weird semantic behaviours, invariants and edge cases. It doesn't reflect a natural and common-sense thought process that a human may follow. It's a jumble of badly-joined patterns with no integral sense of how they fit together in the larger conceptual picture.
satvikpendem•57m ago
Why do people keep insisting that LLMs don't follow a chain of reasoning process? Using the latest LLMs you can see exactly what they "think" and see the resultant output. Plausible code does not mean random code as you seem to imply, it means...code that could work for this particular situation.
tovej•42m ago
Because they don't. The chain-of-reasoning feature is really just a way to get the LLM to prompt itself more.

The fact that it generates these "thinking" steps does not mean it is using them for reasoning. Its most useful effect is making it seem to a human that there is a reasoning process.

satvikpendem•33m ago
How would you determine humans have reasoning then, in a way that LLMs do not?
seba_dos1•26m ago
I love how generating strings like "let me check my notes" is effective at ending up with somewhat better end results - it pushes the weights towards outputting text that appears to be written by someone who did check their notes :D
andai•1h ago
It writes statistically represented code, which is why (unless instructed otherwise) everything defaults to enterprisey, OOP, "I installed 10 trendy dependencies, please hire me" type code.
siliconc0w•42m ago
Just a recent anecdote, I asked the newest Codex to create a UI element that would persist its value on change. I'm using Datastar and have the manual saved on-disk and linked from the AGENTS.md. It's a simple html element with an annotation, a new backend route, and updating a data model. And there are even examples of this elsewhere in the page/app.

I've asked it to do way harder things, so I thought it'd easily one-shot this, but for some reason it absolutely ate it on this task. I tried to re-prompt it several times, but it kept digging a hole for itself, adding more and more inline JavaScript and backend code (and not even cleaning up the old code).

It's hard to appreciate how unintuitive the failure modes are. It can do things probably only a handful of specialists can do but it can also critical fail on what is a straightforward junior programming task.

maremmano•40m ago
this won't age well.
seba_dos1•31m ago
s/code/stuff/
jswelker•27m ago
I also write plausible code. Not much of a moat.