frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
233•theblazehen•2d ago•68 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
695•klaussilveira•15h ago•206 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
7•AlexeyBrin•1h ago•0 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
962•xnx•20h ago•555 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
130•matheusalmeida•2d ago•35 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
67•videotopia•4d ago•6 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
54•jesperordrup•5h ago•25 comments

ga68, the GNU Algol 68 Compiler – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
11•matt_d•3d ago•2 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
37•kaonwarb•3d ago•27 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
236•isitcontent•15h ago•26 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
234•dmpetrov•16h ago•125 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
33•speckx•3d ago•21 comments

UK infants ill after drinking contaminated baby formula of Nestle and Danone

https://www.bbc.com/news/articles/c931rxnwn3lo
12•__natty__•3h ago•0 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
335•vecti•17h ago•147 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
502•todsacerdoti•23h ago•244 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
386•ostacke•21h ago•97 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
300•eljojo•18h ago•186 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
361•aktau•22h ago•185 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
425•lstoll•21h ago•282 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
68•kmm•5d ago•10 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
96•quibono•4d ago•22 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
21•bikenaga•3d ago•11 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
19•1vuio0pswjnm7•1h ago•5 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
265•i5heu•18h ago•217 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
33•romes•4d ago•3 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
64•gfortaine•13h ago•28 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1077•cdrnsf•1d ago•460 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
39•gmays•10h ago•13 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
298•surprisetalk•3d ago•44 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
154•vmatsiiako•20h ago•72 comments
Open in hackernews

We Can Just Measure Things

https://lucumr.pocoo.org/2025/6/17/measuring/
94•tosh•7mo ago

Comments

ToucanLoucan•7mo ago
Still RTFA but this made me rage:

> In fact, we as engineers are quite willing to subject each others to completely inadequate tooling, bad or missing documentation and ridiculous API footguns all the time. “User error” is what we used to call this, nowadays it's a “skill issue”. It puts the blame on the user and absolves the creator, at least momentarily. For APIs it can be random crashes if you use a function wrong

I recently implemented Microsoft's MSAL authentication on iOS which includes as you might expect a function that retrieves the authenticated accounts. Oh sorry, I said function, but there's two actually: one that retrieves one account, and one that retrieves multiple accounts, which is odd but harmless enough right?

Wrong, because whoever designed this had an absolutely galaxy brained moment and decided if you try and retrieve one account when multiple accounts are signed in, instead of, oh I dunno, just returning an error message, or perhaps returning the most recently used account, no no no, what we should do in that case is throw an exception and crash the fucking app.

I just. Why. Why would you design anything this way!? I can't fathom any situation you would use the one-account function in when the multi-account one does the exact same fucking thing, notably WITHOUT the potential to cause a CRASH, and just returns a set of one, and further, why then if you were REALLY INTENT ON making available one that only returned one, it wouldn't itself just call the other function and return Accounts.first.

</ rant>

layer8•7mo ago
How is an exception different from “returning an error message”?
dewey•7mo ago
Seems like the main differentiator is that one crashed and one doesn’t. Unrelated to error message or exception.
johnmaguire•7mo ago
I'm not sure I understand how both occurred at once. Typically an uncaught exception will result in a crash, but this would generally be considered an error at the call site (i.e. failing to handle error conditions.)
layer8•7mo ago
I understood “crashing” as them not catching the exception.

Most functions can fail, and any user-facing app has to be prepared for it so that it behaves gracefully towards the user. In that sense I agree that the error reporting mechanism doesn’t matter. It’s unclear though what the difference was for the GP.

ToucanLoucan•7mo ago
For one: terminating execution

More importantly: why is having more than one account an "exception" at all? That's not an error or fail condition, at least in my mind. I wouldn't call our use of the framework an edge case by any means, it opens a web form in which one puts authentication details, passes through the flow, and then we are given authentication tokens and the user data we need. It's not unheard of for more than one account to be returned (especially on our test devices which have many) and I get the one-account function not being suitable for handling that, my question is... why even have it then, when the multi-account one performs the exact same function, better, without an extra error condition that might arise?

TOGoS•7mo ago
> why is having more than one account an "exception" at all? That's not an error or fail condition

It is if the caller is expecting there to be exactly one account.

This is why I generally like to return a set of things from any function that might possibly return zero or more than one things. Fewer special cases that way.

But if the API of the function is to return one, then you either give one at random, which is probably not right, or throw an exception. And with the latter, the person programming the caller will be nudged towards using the other API, which is probably what they should have done anyway, and then, as you say, the returns-one-account function should probably just not exist at all.

lazide•7mo ago
Chances are, the initial function was written when the underlying auth backend only supported a single account (structurally), and most clients were using that method.

Then later on, it was figured out that multiple accounts per credential set (?!?) needed to be supported, but the original clients still needed to be supported.

And either no one could afree on a sane convention if this happened (like returning the first from the list), or someone was told to ‘just do it’.

So they made the new call, migrated themselves, and put in a uncaught exception in the old place (can’t put any other type there without breaking the API) and blam - ticket closed.

Not that I’ve ever seen that happen before, of course.

Oh, and since the multi-account functionality is obviously new and probably quite rare at first, it could be years before anyone tracks down whoever is responsible, if ever.

layer8•7mo ago
There’s no good way to solve this, though. Returning an arbitrary account can have unpredictable consequences as well if it isn’t the expected one. It’s a compatibility break either way.
lazide•7mo ago
Exactly, which is probably why a better ‘back compatibility’ change couldn’t be agreed on.

But there is a way that closes your ticket fast and will compile!

layer8•7mo ago
Sure, but not introducing the ability to be logged into multiple accounts isn’t the best choice as well. Arguably, throwing an exception upon multiple logins for the old API is the lesser evil overall.
ToucanLoucan•7mo ago
> There’s no good way to solve this, though.

Yes there is! Just get rid of it. It's useless. The re-implementation from using one to the other was barely a few moments of work, and even if you want to say "well that's a breaking change" I mean, yeah? Then break it. I would be far less annoyed if a function was just removed and Xcode went "hey this is pointed at nothing, gotta sort that" rather than letting it run in a way that turns the use of authentication functionality into a landmine.

lazide•7mo ago
I take it you’ve never had to support a widely used publicly available API?

You might be bound to support these calls for many, many years.

kfajdsl•7mo ago
> For one: terminating execution

Seems like you should have a generic error handler that will at a minimum catch unexpected, unhandled exceptions with a 'Something went wrong' toast or similar?

zahlman•7mo ago
> For one: terminating execution

Not if you handle the exception properly.

> why is having more than one account an "exception" at all? That's not an error or fail condition, at least in my mind.

Because you explicitly asked for "the" account, and your request is based on a false premise.

>why even have it then, when the multi-account one performs the exact same function, better, without an extra error condition that might arise?

Because other users of the library explicitly want that to be an error condition, and would rather not write the logic for it themselves.

Performance could factor into it, too, depending on implementation details that obviously I know nothing about.

Or for legacy reasons as described in https://news.ycombinator.com/item?id=44321644 .

wat10000•7mo ago
The iOS UI languages (ObjC and Swift) have three different mechanisms that are in the realm of exceptions/errors.

ObjC has a widespread convention where a failable method will take an NSError** parameter, and fill out that parameter with an error object on failure. (And it's also supposed to indicate failure with a sentinel return value, but that doesn't matter for this discussion.) This is used by nearly all ObjC APIs.

Swift has a language feature for do/try/catch. Under the hood, this is implemented very similarly to the NSError* convention, and the Swift compiler will automatically bridge them when calling between languages. Notably, the implementation does not do stack unwinding, it's just returning an error to the caller by mostly normal means, and the caller checks for errors with the equivalent of an if statement after the call returns. The language forces you to check for errors when making a failable call, or make an explicit choice to ignore or terminate on errors.

ObjC also has exceptions. In modern ObjC, these are implemented as C++ exceptions. They used to be used to signal errors in APIs. This never worked very well. One reason is that ObjC doesn't have scoped destructors, so it's hard to ensure cleanup when an exception is thrown. Another reason is that older ObjC implementations didn't use C++ exceptions, but rather setjmp/longjmp, which is quite slow in the non-failure case, and does exciting things like reset some local variables to the values they had when entering the try block. It was almost entirely abandoned in favor of the NSError* technique and only shows up in a few old APIs these days.

Like C++, there's no language enforcement making sure you catch exceptions from a potentially throwing call. And because exceptions are rarely used in practice, almost no code is exception safe. When an exception is thrown, it's very likely the program will terminate, and if there happens to be an exception handler, it's very likely to leave the program in a bad state that will soon crash.

As such, writing code for iOS that throws exceptions is an exceptionally bad idea.

Jabrov•7mo ago
"crash the app" sounds like the app's problem (ie. not handling exceptions properly) as opposed to the design of the API. It doesn't seem that unreasonable to throw an exception if unexpected conditions are hit? Also, more likely than not, there is probably an explicit reason that an exception is thrown here instead of something else.
raincole•7mo ago
> nowadays it's a “skill issue”

> throw an exception and crash the fucking app

Yes, if your app crashes when a third-party API throws an exception, it's a "skill issue" of you. This comment is an example why sometimes blaming the user's skill issue is valid.

jiggawatts•7mo ago
At the risk of being an amateur psychologist, your approach feels like that of a front end developer used to a forgiving programming model with the equivalent of the old BASIC programming language statement ON EFROR RESUME NEXT.

Server side APIs and especially authentication APIs tend towards the “fail fast” approach. When APIs are accidentally mis-used this is treated either as a compiler error or a deliberate crash to let the developer know. Silent failures are verboten for entire categories of circumstances.

There’s a gradient of: silent success, silent failure, error codes you can ignore, exceptions you can’t, runtime panic, and compilation error.

That you can’t even tell the qualitative difference between the last half of that list is why I’m thinking you’re primarily a JavaScript programmer where only the first two in the list exist for the most part.

audiodude•7mo ago
To me, it makes sense that "Give me the active/main/primary account", when multiple accounts are signed in, is inherently ambiguous. Which account is the main one? You suggest Accounts.first. Is that the first account that was signed into 3 years ago? Maybe you don't want that one then. Is it the most recently signed into account?

The designer of the API decided that if you ask for "the single account" when there are multiple, that is an error condition.

lostdog•7mo ago
A lot of the "science" we do is experimenting on bunches of humans, giving them surveys, and treating the result as objective. How many places can we do much better by surveying a specific AI?

It may not be objective, but at least it's consistent, and it reflects something about the default human position.

For example, there are no good ways of measuring the amount of technical debt in a codebase. It's such a fuzzy question that only subjective measures work. But what if we show the AI one file at a time, ask "Rate, 1-10, the comprehensibility, complexity, and malleability of this code," and then average across the codebase. Then we get measure of tech debt, which we can compare over time to measure if it's rising or falling. The AI makes subjective measurements consistent.

This essay gives such a cool new idea, while only scratching the surface.

delusional•7mo ago
> it reflects something about the default human position

No it doesn't. Nothing that comes out of an LLM reflects anything except the corpus it was trained on and the sampling method used. That definitionally true, since those are the very things it is a product of.

You get NO subjective or objective insight from asking the AI about "technical debt" you only get an opaque statistical metric that you can't explain.

BriggyDwiggs42•7mo ago
If you knew that the model never changed it might be very helpful, but most of the big providers constantly mess with their models.
cwillu•7mo ago
Even if you used a local copy of a model, it would still just be a semi-quantitative version of “everyone knows ‹thing-you-don't-have-a-grounded-argument-for›”
layer8•7mo ago
Their performance also varies depending on load (concurrent users).
BriggyDwiggs42•7mo ago
Dear god does it really? That’s very funny.
wiseowise•7mo ago
Why are you surprised? It’s a computational thing, after all.
BriggyDwiggs42•7mo ago
It’s not that crazy, just the architecture of differently quantized models and so on that you’d need to do that is impressive considering.
layer8•7mo ago
The models are the same, it's the surrounding processing like "thinking" iterations that are adjusted.
BriggyDwiggs42•7mo ago
That only works for LRMs no? Not traditional LLM inference.
layer8•7mo ago
We can just measure things, but then there’s Goodhart's law.

With the proposed way of measuring code quality, it’s also unclear how comparable the resulting numbers would be between different projects. If one project has more essential complexity than another project, it’s bound to yield a worse score, even if the code quality is on par.

Marazan•7mo ago
I would argue you can't compare between projects due to the reasons you state. But you can try and improve the metrics within a single project.

Cycolmatic complexity is a terrible metric to obsesses over yet in a project I was on it was undeniably true that the newer code written by more experienced Devs was both subjectively nicer and also had lower cycolmatic complexity than the older code worked on by a bunch of juniors (some of the juniors had then become some of the experienced Devs who wrote the newer code)

layer8•7mo ago
> But you can try and improve the metrics within a single project.

Yes. But it means that it doesn’t let you assess code quality, only (at best) changes in code quality. And it’s difficult as soon as you add or remove functionality, because then it isn’t strictly speaking the same project anymore, as you may have increased or decreased the essential complexity. What you can assess is whether a pure refactor improves or worsens a project’s amenibility to AI coding.

elktown•7mo ago
I think this is advertisement for an upcoming product. Sure, join the AI gold rush, but at least be transparent about it.
falcor84•7mo ago
Even if he does have some aspiration to make money by operationalizing this (which I didn't sense that he does), what Armin describes there is something that's almost trivial to implement a basic version of yourself in under an hour.
elktown•7mo ago
> which I didn't sense that he does

I'd take a wager.

the_mitsuhiko•7mo ago
If your wager is that I will build an AI code quality measuring tool then you will lose it. I'm not advertising anything here, I'm just playing with things.
elktown•7mo ago
> code quality measuring tool

I didn't, just an AI tool in general.

GardenLetter27•7mo ago
I'm really skeptical of using current LLMs for judging codebases like this. Just today I got Gemini to solve a tricky bug, but it only worked after providing it more debug output after solving some of it myself.

The first time I tried without the deeper output, it "solved" it by writing a load of code that failed in loads of other ways, and ended up not even being related to the actual issue.

Like you can be certain it'll give you some nice looking metrics and measurements - but how do you know if they're accurate?

the_mitsuhiko•7mo ago
> I'm really skeptical of using current LLMs for judging codebases like this.

I'm not necessarily convinced that the current generation of LLMs are overly amazing at this, but they definitely are very good at measuring inefficiency of tooling and problematic APIs. That's not all the issues, but it can at least be useful to evaluate some classes of problems.

falcor84•7mo ago
What do you mean that it "ended up not even being related to the actual issue"? If you give it a failing test suite to turn green and it does, then either its solution is indeed related to the issue, or your tests are incomplete; so you improve the tests and try again, right? Or am I missing something?
GardenLetter27•7mo ago
It made the other tests fail, I wasn't using it in agent mode, just trying to debug the issue.

The issue is that it can happily go down the completely wrong path and report exactly the same as though it's solved the problem.

cmrdporcupine•7mo ago
I explain this in sibling-node comment but I've caught Claude multiple times in the last week just inserting special-case kludges to make things "pass", without actually successfully fixing the underlying problem that the test was checking for.

Just outright "if test-is-running { return success; }" level stuff.

Not kidding. 3 or 4 times in the past week.

Thinking of cancelling my subscription, but I also find it kind of... entertaining?

jiggawatts•7mo ago
I just realised that this is probably a side-effect of a faulty training regime. I’ve heard several industry heads say that programming is “easy” to generate synthetic data for and is also amenable to training methods that teach the AI to increase the pass rate of unit tests.

So… it did.

It made the tests pass.

“Job done boss!”

falcor84•7mo ago
I found that working with an AI is most productive when I do so in an Adversarial TDD state of mind. As described in this classic qntm post [0] following the VW emissions scandal, which concludes with:

> Honestly? I blame the testing regime here, for trusting the engine manufacturers too much. It was foolish to ever think that the manufacturers were on anybody's side but their own.

> It sucks to be writing tests for people who aren't on your side, but in this case there's nothing which can change that.

> Lesson learned. Now it's time to harden those tests up.

[0] https://qntm.org/emissions

cmrdporcupine•7mo ago
I have mixed results but one of the more disturbing things I've found Claude doing is that when confronted with a failing test case, and not being able to solve a tricky problem.. just writing a kludge into the code that identifies that here's a test running, and makes it pass. But only for that case. Basically, totally cheating.

You have to be super careful and review everything because if you don't you can find your code littered with this strange mix of seeming brilliance which makes you complacent... and total Junior SWE behaviour or just outright negligence.

That, or recently, it's just started declaring victory and claiming to have fixed things, even when the test continues to fail. Totally trying to gaslight me.

I swear I wasn't seeing this kind of thing two weeks ago, which makes me wonder if Anthropic has been turning some dials...

quesera•7mo ago
> identifies that here's a test running, and makes it pass. But only for that case

My team refers to this as a "VW Bugfix".

alwa•7mo ago
I also feel like I’ve seen a lot more of these over the past week or two, whereas I don’t remember noticing it at all before then.

It feels like it’s become grabbier and less able to stay in its lane: ask for a narrow thing, and next thing you know it’s running hog wild across the codebase shoehorning in half-cocked major architectural changes you never asked for. [Ed.: wow, how’s that for mixing metaphors?]

Then it smugly announces success, even when it runs the tests and sees them fail. “Let me test our fix” / [tests fail] / [accurately summarizes the way the tests are failing] / “Great! The change is working now!”

cmrdporcupine•7mo ago
Yes, or I've seen lately "a few unrelated tests are failing [actually same test as before] but the core problem is solved."

After leaving a trail of mess all over.

Wat?

Someone is changing some weights and measures over at Anthropic and it's not appreciated.

throwdbaaway•7mo ago
It is possible that the tasks you gave to the model previously were just about easy enough for it to handle, while the few failing tasks you gave recently were a bit too tough for the model, thus it had to cheat.

For the exact same task, some changes in the system prompt used by Claude Code, and/or how it constructs the user prompt, can quite easily make the task either easy enough or not. It is a fine line.

yujzgzc•7mo ago
Another, related benefit of LLMs in this situation is that we can observe their hallucinations and use them for design. I've come up with a couple situations where I saw Copilot hallucinate a method, and I agreed that that method should've been there. It helps confirm whether the naming of things makes sense too.

What's ironic about this is that the very things that TFA points out are needed for success (test coverage, debuggability, a way to run locally etc) are exactly the things that typical LLMs themselves lack.

crazygringo•7mo ago
I've found LLM's to be extremely helpful in naming and general function/API design, where there a lot of different ways to express combinations of parameters.

I know what seems natural to me but that's because I'm extremely familiar with the internal workings of the project. LLM's seem to be very good at coming with names that are just descriptive enough but not too long, and most importantly follow "general conventions" from similar projects that I may not be aware of. I can't count the number of times an LLM has given me a name for a function that I've thought, oh of course, that's a much clearer name that what I was using. And I thought I was already pretty good at naming things...

autobodie•7mo ago
More often, LLMs give me wildly complex and over-engineered solutions.
steveklabnik•7mo ago
I have had this happen too, but a "this seems complex, do we need that complexity? can you justify it? or can we make this simpler?" or similar has them come back with something much better.
istjohn•7mo ago
I get good results with: "Can this be improved while maintaining simplicity and concision?"
autobodie•7mo ago
When I do that, they come back with something more simple than before but either wrong or still unnecessarily complex.

Over the past week, I have been writing a small library for midi controller I/O and simple/elegant is the priority. I am not really that opinionated. I just want it to not be overengineered. AI has been able to make some suggestions when I give it a specific goal for refactoring a specific class, but it cannot solve a problem on its own without overengineered solutions.

sfn42•7mo ago
You just have to be more specific. Don't just tell it to refactor, tell it how to refactor. I usually start out a bit vague then I add more specific instructions when I want specific changes.

I often just make the changes myself because it's faster than describing them.

You do the thinking, LLM does the writing. The LLM doesn't solve problems, that's your job. The LLMs job is to help you do the job more efficiently. Not just do it for you.

creatonez•7mo ago
True, but... a good chunk of the time, when an API doesn't exist, it really is because it shouldn't exist. I've seen many examples where an LLM hallucinates an API it wishes existed, but in reality implementing it upstream would turn it into an insane jumble or violate the internal logic in a very bad way. For example: inserting to the middle of a concurrent queue data structure.
timhigins•7mo ago
The title of this post really doesn’t match the core message/thesis, which is a disappointing trend in many recent articles.
stephc_int13•7mo ago
I think we very rarely can measure things as soon as they have more than one dimension and unit. Measurements aggregate are weighted and thus arbitrary and or incomplete.

This is a common and irritating intellectual trap. We want to measure things as this gives us a handle to apply algorithms or logical processes on them.

But we can only measure very simple and well defined dimensions such as mass, length, speed etc.

Being measurable is the exception, not the rule.