
OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
391•klaussilveira•5h ago•85 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
750•xnx•10h ago•459 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
118•dmpetrov•5h ago•49 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
131•isitcontent•5h ago•14 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
234•vecti•7h ago•113 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
28•quibono•4d ago•2 comments

A century of hair samples proves leaded gas ban worked

https://arstechnica.com/science/2026/02/a-century-of-hair-samples-proves-leaded-gas-ban-worked/
57•jnord•3d ago•3 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
302•aktau•11h ago•152 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
304•ostacke•11h ago•82 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
160•eljojo•8h ago•121 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
377•todsacerdoti•13h ago•214 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
44•phreda4•4h ago•7 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
305•lstoll•11h ago•230 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
100•vmatsiiako•10h ago•34 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
167•i5heu•8h ago•127 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
138•limoce•3d ago•76 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
223•surprisetalk•3d ago•29 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
36•rescrv•12h ago•17 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
956•cdrnsf•14h ago•413 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
8•gfortaine•2h ago•0 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
7•kmm•4d ago•0 comments

Evaluating and mitigating the growing risk of LLM-discovered 0-days

https://red.anthropic.com/2026/zero-days/
33•lebovic•1d ago•11 comments

I'm going to cure my girlfriend's brain tumor

https://andrewjrod.substack.com/p/im-going-to-cure-my-girlfriends-brain
30•ray__•1h ago•6 comments

Claude Composer

https://www.josh.ing/blog/claude-composer
97•coloneltcb•2d ago•68 comments

The Oklahoma Architect Who Turned Kitsch into Art

https://www.bloomberg.com/news/features/2026-01-31/oklahoma-architect-bruce-goff-s-wild-home-desi...
17•MarlonPro•3d ago•2 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
76•antves•1d ago•56 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
37•nwparker•1d ago•8 comments

How virtual textures work

https://www.shlom.dev/articles/how-virtual-textures-really-work/
23•betamark•12h ago•22 comments

Evolution of car door handles over the decades

https://newatlas.com/automotive/evolution-car-door-handle/
38•andsoitis•3d ago•61 comments

The Beauty of Slag

https://mag.uchicago.edu/science-medicine/beauty-slag
27•sohkamyung•3d ago•3 comments

Do variable names matter for AI code completion? (2025)

https://yakubov.org/blogs/2025-07-25-variable-naming-impact-on-ai-code-completion
54•yakubov_org•6mo ago

Comments

yakubov_org•6mo ago
When GitHub Copilot suggests your next line of code, does it matter whether your variables are named "current_temperature" or just "x"?

I ran an experiment to find out, testing 8 different AI models on 500 Python code samples across 7 naming styles. The results suggest that descriptive variable names do help AI code completion.

Full paper: https://www.researchsquare.com/article/rs-7180885/v1
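To make the setup concrete, here is a rough sketch (my own illustration, not the paper's actual harness) of how one could mechanically strip descriptive names from Python samples before measuring completion accuracy:

```python
import ast

class NameScrambler(ast.NodeTransformer):
    """Rewrite every identifier to an opaque placeholder (x0, x1, ...),
    keeping all occurrences of the same name consistent. Note this also
    renames called functions like print(); a real harness would skip
    builtins and imported names."""
    def __init__(self):
        self.mapping = {}

    def visit_Name(self, node):
        if node.id not in self.mapping:
            self.mapping[node.id] = f"x{len(self.mapping)}"
        node.id = self.mapping[node.id]
        return node

source = "current_temperature = read_sensor()\nprint(current_temperature)"
scrambled = ast.unparse(NameScrambler().visit(ast.parse(source)))
print(scrambled)
```

Running both the original and scrambled versions through the same completion model would then isolate how much the names themselves contribute.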

amelius•6mo ago
Shouldn't the LLMs therefore train on code where the variable names have been randomized?

Perhaps it will make them more intelligent ...

jerf•6mo ago
No. Variable names contain valuable information. That's why humans use them too.

AIs are finite. If they're burning brainpower on determining what "x" means, that's brainpower they're not burning on your actual task. It is no different than for humans. Complete with all the considerations about them being wrong, etc.

amelius•6mo ago
Training them on randomized var names doesn't mean you should do it deliberately during inference ...

Also, I think this is anthropomorphizing the LLMs a bit too much. They are not humans, and I'd like to see an experiment on how well they perform when trained with randomized variable names.

jerf•6mo ago
Neither "variable names contain valuable information" nor "AIs are finite" are anthropomorphization. That variable names contain information is not only relative to some sort of human cognition; they objectively, mathematically, contain information. Remove that information and the AI is going to have to reconstruct it (or at least most of it), and as finite things, that's going to cost them because nothing like that is free for finite things. None of my assertions here depend on the humanity, the AI-ness, the Vulcan-ness, or any other conceivable finite intellectual architecture of any coding agent. It only depends on them being finite.
amelius•6mo ago
Let's stop with the comparison to humans; I'm more interested in why it would hurt to train LLMs on harder puzzles. Isn't that what we're doing all the time when training LLMs? I'm just suggesting an easy way to construct new puzzles: randomize the variable names.
recursive•6mo ago
An even easier way to construct new puzzles is to fully randomize the problem statements and intended solutions.

When you take out the information from the variable names, you're making the training data farther from real-world data. Practicing walking on your hands, while harder than walking on your feet, won't make you better at hiking. In fact, if you spend your limited training resources on it, the opportunity cost might make you worse.

JambalayaJimbo•6mo ago
But we know that variable names do not matter whatsoever to a compiler. Without looking at hard data, I do agree with you intuitively that LLMs perform better with meaningful variable names - but I don't think it has anything to do with "brainpower". I just think your input is more likely to resemble the training data when it has meaningful variable names.
socalgal2•6mo ago
I think you're just arguing semantics. It seems intuitively obvious that if I have some simple physics code

    newPosition = currentPos + velocity * deltaTime
and change it to

    addressInChina = weightByGold + numberOfDogsOwned * birdPopulationInFrance
that both a human and likely an LLM will struggle to understand the code and do the right thing. The thing we're discussing is whether the LLM struggles. No one cares whether that's literally "brain" power; all anyone cares about is whether the LLM does better, worse, or the same.

> I just think your input data is more likely to resemble training data with meaningful variable names.

Based on giving job interviews, cryptic names are common.

_0ffh•6mo ago
Exactly: the task being harder when the variable name carries no information is what makes training like that a good idea. It forces the LLM to pay attention to the actual code to get it right, which in training is a Good Thing (TM).
datameta•6mo ago
Right. The available information goes down as metadata is stripped from the variable name, and the value has to be re-contextualized in a fresh logical chain every time.
JambalayaJimbo•6mo ago
LLMs do not have brains and there is no evidence as far as I know that they "think" like human beings do.
gnulinux•6mo ago
LLMs do not reason at all (i.e. deductive reasoning using a formal system). Chain of thought etc simulate reasoning by smoothing out the path to target tokens by adding shorter stops on the way.
ACCount36•6mo ago
Do you reason? "LLMs do not reason at all" casts that into doubt immediately.
appreciatorBus•6mo ago
That being true does not mean that there are no limits to whatever it might be doing, which might be wasted with ambiguous naming schemes.

I am far from an AI booster or power user but in my experience, I get much better results with descriptive identifier names.

ACCount36•6mo ago
LLMs are only capable of performing a finite amount of computation within a single forward pass. We know that much.

They are also known to operate on high-level abstractions and concepts - unlike systems operating strictly on formal logic, and very much like humans.

fenomas•6mo ago
LLMs do see randomized identifiers whenever they encounter minified code. And you can get a bit of an idea of how much they learn by giving an LLM some minified JS and asking it to restore it with meaningful variable names.

When I tried it once the model did a surprisingly good job, though it was quite a while ago and with a small model by today's standards.

knome•6mo ago
if you train them on randomized names, they'll also suggest them.

better to not, I think.

dingnuts•6mo ago
No, they're more likely to predict the correct next token the closer the code is to the training set. If you're doing something generic, short names will get the right predictions; if you're working in a specific problem domain, an input that starts sequence generation in a part of the model trained on that domain is going to do better.
empath75•6mo ago
They're trained on plenty of code with bad variable names.

But every time you make an AI think you are introducing an opportunity for it to make a mistake.

ssalka•6mo ago
The names of variables impart semantic meaning, which LLMs can pick up on and use as context for determining how variables should behave or be used. Seems obvious to me that `current_temperature` is a superior name to `x` – that is, unless we're doing competitive programming ;)
yakubov_org•6mo ago
My first hypothesis was that shorter variable names would use fewer tokens and be better for context utilisation and inference speed. I would expand your competitive programming angle to the obfuscated C challenge ;)
Macha•6mo ago
The problem is, unless you're doing green-field development, the description of the existing desired functionality has to live somewhere, and I suspect a parallel markdown requirements document plus code with golfed variable names is going to require more context, not less.
Groxx•6mo ago
Obviously yes. They all routinely treat my "thingsByID" array like a dictionary - it's a compact array where ID = index though.

They even screw that up inside the tiny function that populates it. If anything IMO, they over-value names immensely (which makes sense, given how they work, and how broadly consistent programmers are with naming).
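A Python rendering of the pattern being described here (hypothetical names; the actual code under discussion is in Go):

```python
# Despite the dict-ish name, things_by_id is a dense list where the
# index *is* the ID -- exactly the shape models tend to misread.
things_by_id = [None] * 3
for thing_id, name in [(0, "widget"), (2, "gadget")]:
    things_by_id[thing_id] = name

assert things_by_id[2] == "gadget"         # index lookup, not a dict lookup
assert not isinstance(things_by_id, dict)  # no .get(), .items(), etc.
```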

gnulinux•6mo ago
Do you still have this problem if you add a comment before declaring the variable like "Note: thingsById is not a dictionary, it is an array. Each index of the array represents a blabla id that maps to a thing"

In my experience they do overvalue var names, but they value comments even more. So I tend to calibrate these things with more detailed comments.

bluefirebrand•6mo ago
Can't you just write the code instead of the more detailed comments? What is the benefit of this approach?
lazide•6mo ago
Have you ever been vacuuming, and ran across a little thing on the floor that refuses to get picked up (a small piece of metal or whatever), and spent 5x more time trying to get the vacuum to work than it would have taken to just pick it up and throw it away?

And then eventually reach down and pick it up - to feed it to the vacuum?

That is what this reminds me of.

gnulinux•6mo ago
You can fix the code, of course; I just experiment with what sort of comments produce better code. In my experience, heavily commented code is handled by LLMs significantly better, so the quality of comments adds up: if you're planning to use an LLM on a project long term, it pays off to comment the code for the LLM's context.
DullPointer•6mo ago
Curious if you get better results with something like “thingsByIdx” or “thingsByIndex,” etc.?
delifue•6mo ago
Did you add the type annotation of it in code?
Groxx•6mo ago
This is in Go, so both "yes" (it's defined with an explicit type in the file, sometimes the same func) and "yes but" (afaict next to no code-agent looks at type information that e.g. gopls has readily available, or even godoc).
partdavid•6mo ago
I get what you're saying, but what's interesting to me is that this case is a mild signal that a subsequent developer could take the same erroneous implication. "Id" does in fact imply to me that entries are indexed by "Id", i.e., an attribute of the item being indexed, and that they are not array-like, in that they wouldn't all get different IDs by a deletion, for example.
robertclaus•6mo ago
Nice to see actual data!
OutOfHere•6mo ago
Section names (as a comment) help greatly in long functions. Section names can also help partially compensate for some of the ambiguity of variable names.

Another thing that matters massively in Python is highly accurate, clear, and sensible type annotations. In contrast, incorrect type annotations can throw off the LLM.

r0s•6mo ago
The purpose of code is for humans to read.

Until AI is compiling straight to machine language, code needs to be readable.

deadbabe•6mo ago
Variable names don’t matter in small scopes.
r0s•6mo ago
The scope of the cognitive effort is the total context of the system. Yes it matters.
rented_mule•6mo ago
It certainly can matter in any scope. `x` or even `delay` will lead to more bugs down the line than `delay_in_milliseconds`. It can be incredibly frustrating to debug why `delay = 1` does not appear to lead to a delay if your first impression is that `delay` (or `x`) is in seconds.
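A minimal sketch of that failure mode (the function name here is hypothetical):

```python
import time

def wait_before_retry(delay_in_milliseconds: float) -> None:
    # The unit is in the name, so wait_before_retry(1) visibly means
    # 1 ms; a bare `delay` parameter could just as plausibly be seconds.
    time.sleep(delay_in_milliseconds / 1000)

start = time.monotonic()
wait_before_retry(1)                   # 1 millisecond, not 1 second
assert time.monotonic() - start < 0.5  # would fail if 1 meant one second
```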
deadbabe•6mo ago
You will have exactly the same problem if delay_in_milliseconds is actually misnamed and the delay is measured in seconds.

Comments lie. Names lie. Code is the only source of truth.

fwip•6mo ago
No - "delay_in_milliseconds" will let you find the error and resolve it faster. With the less descriptive name, you need to notice the mismatch between the definition and the use site, which are further apart in context. Imagine you see in your debugger: "delay_in_milliseconds: 3" in your HttpTimeout - you'll instantly know that's wrong.

If you believe your reductive argument, your function and variable names would all be minimally descriptive, right?

deadbabe•6mo ago
For your specific example, there would never be a “delay in milliseconds” variable in the first place. That’s just throat clearing.

“sleep 1” is the complete expression. Because sleep takes a parameter measured in seconds, it’s already understood.

You do not need “delay_in_seconds = 1” and then a separate “sleep delay_in_seconds”. That accomplishes nothing, you might as well add a comment like “//seconds” if you want some kind of clarity.

rented_mule•6mo ago
Years later, when all memory of intent is long gone, I'd much rather work on a large code base that errs on the side of too much "throat clearing" than one that errs on the side of too little. `sleep 1` tells what was written, which may or may not match intent.

Many bugs come from writing something that does not match intent. For example, someone writes most of their code in another language where `sleep` takes milliseconds, they meant to check the docs when they wrote it in this language, but the alarm for the annual fire drill went off just as they were about to check. So it went in as `sleep 1000` in a branch of the code that only runs occasionally. Years later, did they really mean 16 minutes and 40 seconds, or did they mean 1 second?

Leaving clues about intent helps detect such issues in review and helps debug the problems that slip through review. Comments are better than nothing, but they are easier to ignore than variable names.

deadbabe•6mo ago
If the code isn’t working, then intent doesn’t matter. The code was wrong.

If the code is working, the intent also doesn’t matter, what was written is what was intended.

Do the requirements call for an alarm of 16 minutes 40 seconds? Then leave the code be. If not, just change it.

xigoi•6mo ago
The only correct way is

    delay = Duration.milliseconds(1)
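Python's standard library supports the same idea via `datetime.timedelta` (standing in here for the hypothetical `Duration` type): the unit is carried by the type, not the name.

```python
from datetime import timedelta

delay = timedelta(milliseconds=1)
# The type owns the unit, so conversions are explicit and unambiguous.
assert delay.total_seconds() == 0.001
assert delay < timedelta(seconds=1)  # comparisons are unit-aware
```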
qwertytyyuu•6mo ago
lol why is SCREAM_SNAKE_CASE outperforming
michaelhoney•6mo ago
yeah, Claude's just gonna have to deal with regular_snake_case from me
nemo1618•6mo ago
Time for Hungarian notation to make a comeback? I've always felt it was unfairly maligned. It would probably give LLMs a decent boost to see the type "directly" rather than needing to look up the type via search or tool call.
socalgal2•6mo ago
It was and still is

https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...

Types help, but they don't help "at a glance". In editors with type info you have to hover over variables or look elsewhere in the code (even several lines up) to figure out what you're actually looking at. In Apps Hungarian this problem goes away.
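The safe/unsafe string example from that post, sketched in Python with the `us`/`s` prefixes Joel uses for unsafe (raw) vs. safe (HTML-escaped) strings:

```python
import html

# Apps Hungarian encodes the *kind* of value, not its machine type:
# "us" = unsafe raw user input, "s" = safe, already HTML-escaped.
usComment = '<script>alert("pwned")</script>'
sComment = html.escape(usComment)  # the one sanctioned usX -> sX step

# Now `render(usComment)` looks wrong at a glance, while
# `render(sComment)` reads as correct -- no hover or F12 needed.
assert "<script>" not in sComment
```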

hmry•6mo ago
I remember thinking this post was outdated when I first read it.

"Safe strings and unsafe strings have the same type - string - so we need to give them different naming conventions." I thought "Surely the solution is to give them different types instead. We have a tool to solve this, the type system."

"Operator overloading is bad because you need to read the entire code to find the declaration of the variable and the definition of the operator." I thought "No, just hit F12 to jump to definition. (Also, doesn't this apply to methods as well, not just operators?) We have a tool to solve this, the IDE."

If it really does turn out that the article's way is making a comeback 20 years later... How depressing would that be? All those advances in compilers and language design and editors thrown out, because LLMs can't use them?

selimthegrim•6mo ago
I wonder if LLMs grok multiple dispatch
k__•6mo ago
It's kinda funny that people are taking decades of good coding practices seriously now that they work with AI instead of humans.
roxolotl•6mo ago
I was talking to a coworker about how they get the most out of Claude Code and they just went on to list every best practice they've never been willing to implement when working previously. For some reason people are willing to produce design documentation, provide comments that explain why, write self documenting code and so on now that they are using LLMs to generate code.

It's the same with the articles about how to work with these tools. A long list of coding best practices followed by a totally clueless "wow once I do all the hard work LLMs generate great code every time!"

nzach•6mo ago
> For some reason people are willing to produce design documentation....

I'm assuming you wrote that just for dramatic effect but let me explain why I think this behavior is completely rational.

If you implement "feature X" you already learned everything you needed, so adding documentation is a task that doesn't bring you any benefits. You could argue that it would make your life easier when you have to do some maintenance in this code, but that is a pretty big time investment for something you may use some day.

But now with LLMs that reasoning changes dramatically. Having good documentation makes your life easier right now. And the same argument can be made for every good practice that people never bothered to follow: commit messages, tests, variable naming, ...

For example, where I work the 'developer experience' team created a bot that reads your merge requests and judges whether your changes are small enough to bypass explicit approval from a coworker. And one of the things it takes into account is how well documented the change is. If you have a small change without any context the bot won't approve your MR, but if you explain the bug and add some relevant tests it will approve the MR and let you skip human code review.

And the result of this new bot is that people are starting to better document their changes, because this allows them to work faster.

So, I agree with GP that it's funny to see this play out. But it should not be surprising behavior for anyone who understands how software is written.

kingstnap•6mo ago
"Context engineering" + "Prompt Engineering":

1. Having clear requirements with low ambiguity. 2. Giving a few input output pairs on how something should work (few shot prompting). 3. Avoiding providing useless information. Be consicise. 4. Avoid having contradictory information or distractors. 5. Break complex problems into more manageable pieces. 6. Provide goals and style guides.

A.K.A its just good engineering.

quuxplusone•6mo ago
"500 code samples generated by Magistral-24B" — So you didn't use real code?

The paper is totally mum on how "descriptive" names (e.g. process_user_input) differ from "snake_case" names (e.g. process_user_input).

The actual question here is not about the model but merely about the tokenizer: is it the case that e.g. process_user_input encodes into 5 tokens, ProcessUserInput into 3, and calcpay into 1? If you don't break down the problem into simple objective questions like this, you'll never produce anything worth reading.

ijk•6mo ago
True - though in the actual case of your examples, calcpay, process_user_input, and ProcessUserInput all encode into exactly 3 tokens with GPT-4.

Which is the exact kind of information that you want to know.

It is very non-obvious which one will use more tokens; the Gemma tokenizer has the highest variance with process|_|user|_|input = 5 tokens and Process|UserInput as 2 tokens.

In practice, I'd expect the performance difference to be relatively minimal, as input tokens tends to quickly get aggregated into more general concepts. But that's the kind of question that's worth getting metrics on: my intuition suggests one answer, but do the numbers actually hold up when you actually measure it?

quuxplusone•6mo ago
Awesome! You should have written this blog post instead of that guy. :)
Sohcahtoa82•6mo ago
It'd be interesting to see another result:

Adversarially named variables. As in, variables that are named something that is deliberately wrong and misleading.

    import json as csv
    close = open
    with close("dogs.yaml") as socket:
        time = csv.loads(socket.read())
    for sqlite3 in time:
        # I dunno, more horrifying stuff