Libpostal: C library for parsing/normalizing street addresses around the world

https://github.com/openvenues/libpostal
104•nateb2022•7mo ago

Comments

jandrese•7mo ago
Wow, ambitious project. Anybody who has tried to verify addresses can tell you that the staggering number of different formats and conventions around the world makes it an almost intractable problem. So many countries have wildly informal standards, with people putting down whatever they want because the mailman "just knows".
monero-xmr•7mo ago
Maxmind is the quintessential example of what devs want to build in their heart of hearts. Low-touch sales but b2b. Almost a monopoly. Prints money for decades. Not a public company so they never increase costs to a usurious amount. Open source never quite meets the level needed
derdi•7mo ago
> Anybody who has tried to verify addresses

Why would one try to "verify" addresses that one knows nothing about?

> because the mailman "just knows"

The mailman does "just know", and the mailman is who the address is for. Web forms I have seen that have tried to "verify" my address have never done so in a way that made the address better for the mailman.

EDIT: I've long thought that web forms should not have separate "street", "street line 2", "number", "apartment", "whatever" fields. Instead they should offer a multi-line input field labeled "this will go straight on the address label, write whatever you like but it's your problem if it doesn't arrive". You'd probably still need separate fields for town/postcode for calculating postage. And of course it wouldn't work because the downstream delivery company would also insist on something it can "verify".

kevin_thibedeau•7mo ago
For the US the underlying need for parsing is to determine a definitive location so that taxation, which can vary down to the municipality level, can be computed.
devilbunny•7mo ago
And even that isn’t necessarily enough. A college friend worked at a pizza place that did almost all business by delivery. The store itself actually crossed a city-county border. The cash registers were physically in the back, because that was the county (with lower sales tax). Technically, all money changed hands in the county, not the city.

I would be more suspicious of this story if I hadn’t seen that the registers were, actually, in the back. And they didn’t have a pickup window back there or anything.

jandrese•7mo ago
> Why would one try to "verify" addresses that one knows nothing about?

So you aren't shipping your product to some place that doesn't exist. Also, some KYC requires that you verify the address of the person.

derdi•7mo ago
> So you aren't shipping your product to some place that doesn't exist.

But businesses can't usually verify whether a place exists. The best they can usually do is to verify whether a place has an entry in their database of supposedly all places that supposedly existed at a point in time that is necessarily in the past.

That's not the same thing. Trust me, I would know: I live in a new-ish building, and for at least two years after it was completed and people were living here, some businesses still refused to take my money because they claimed that my address didn't exist. That was neither in their interest nor in mine.

> Also, some KYC requires that you verify the address of the person.

Define "verify". Verify that they provided some address that exists somewhere, possibly unconnected to the person? Worthless. Verify that they can receive mail at said address? OK, but doesn't require you to parse the address, just to print it onto a label and let the post office worry about it.

mpeg•7mo ago
It’s not just verifying that the address exists, for KYC you usually check if the address is linked to that person, and to do so you need some way to account for variations in the way the address is written.
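A minimal sketch of what "accounting for variations" can mean at its simplest: lowercase, drop punctuation, expand a few abbreviations, then compare. The `ABBREVIATIONS` table and the function names are hypothetical and purely illustrative; real matching needs far more than this.

```python
import re

# Hypothetical, deliberately tiny abbreviation table (an assumption for illustration).
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue", "apt": "apartment"}


def canonical(address: str) -> str:
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def same_address(a: str, b: str) -> bool:
    """True when both spellings reduce to the same canonical form."""
    return canonical(a) == canonical(b)
```

Even this toy version shows the trap: "st" can also mean "saint", so a fixed table will mis-expand some addresses.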
grapesodaaaaa•7mo ago
In the shipping scenario, you can’t really know if it’s a local address or not without talking to someone with local knowledge.

The FAA even legally accepts “third house down from the barn” in some instances.

The KYC scenario is different, and a PITA for people like me, because I spent half my life without a physical mailing address (we picked it up at the post office).

The real world is messy, and I feel like SV and finance have done a lot of hand waving to ignore this.

Ameo•7mo ago
I used this at a previous company with quite good success.

With relatively minimal effort, I was able to spin up a little standalone container that wrapped around the service and exposed a basic API to parse a raw address string and return it as structured data.

Address parsing is definitely an extremely complex problem space with practically infinite edge cases, but libpostal does just about as well as I could expect it to.
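For anyone wondering what the "raw string in, structured data out" step of such a wrapper might look like, here is a rough sketch. It assumes the pypostal Python bindings, whose `parse_address` returns a list of `(value, label)` pairs; `to_structured` and `parse_raw_address` are hypothetical helper names, and the HTTP layer is left out.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Pairs = Iterable[Tuple[str, str]]


def to_structured(pairs: Pairs) -> Dict[str, str]:
    """Collapse (value, label) pairs into a dict, joining repeated labels."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for value, label in pairs:
        grouped[label].append(value)
    return {label: " ".join(values) for label, values in grouped.items()}


def parse_raw_address(
    raw: str, parser: Optional[Callable[[str], Pairs]] = None
) -> Dict[str, str]:
    """Parse a raw address string into labelled components."""
    if parser is None:
        # Assumption: pypostal and libpostal's data files are installed.
        from postal.parser import parse_address
        parser = parse_address
    return to_structured(parser(raw))
```

Passing the parser in as an argument keeps the reshaping logic testable on machines that do not have libpostal's data downloaded.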

degamad•7mo ago
Ditto - I was impressed with how well it handled the weird edge cases in our data.

They've managed to create a great working implementation of a very, very small model of a very specific subset of language.

ethan_smith•7mo ago
Worth noting that libpostal requires ~2GB RAM when fully loaded due to its comprehensive data models. For containerized deployments, we reduced memory usage by ~70% by compiling with only the specific country models needed for our use case.
degamad•7mo ago
Previously:

<https://news.ycombinator.com/item?id=18775099> Libpostal: A C library for parsing/normalizing street addresses around the world - 117 points by polm23 on Dec 29, 2018 (25 comments)

<https://news.ycombinator.com/item?id=11173920> Libpostal: international street address parsing in C trained on OpenStreetMap (mapzen.com) 74 points by riordan on Feb 25, 2016 (7 comments)

RobinL•7mo ago
There are many useful applications of libpostal, and it's an impressive library, but one I would caution against is for the purpose of address matching, at least as the 'primary' approach.

The problem is the hardest to parse addresses are also often the hardest to match, making the problem somewhat circular. I wrote about this more in a recent blog on address matching: https://www.robinlinacre.com/address_matching/

kleiba•7mo ago
Relevant? -> "Falsehoods programmers believe about addresses" (https://www.mjt.me.uk/posts/falsehoods-programmers-believe-a...)

Discussed on HN here: https://news.ycombinator.com/item?id=8907301

tempodox•7mo ago
… where the first falsehood is that a computer could be able to parse an address at all (let alone normalize it). Just take the address as given and leave the rest to the mail delivery person.
degamad•7mo ago
So relevant that it's even linked in the readme as advice to users! :-)
weinzierl•7mo ago
In the same vein, there is also Google's excellent libphonenumber for parsing, formatting, and validating international phone numbers.

And because I had no idea before I worked on a project where we had to deal with customer data: many companies also use commercial services for address and phone number validation and normalization.

wink•7mo ago
And yet we still have forms in 2025 that were coded to not strip spaces or dashes in phone numbers or grasp the concept of +XX country prefixes.
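The stripping part, at least, is a few lines. A minimal sketch using only the standard library (`normalize_phone` is a hypothetical name); it only removes formatting and keeps a leading `+`, and does not validate the number the way libphonenumber does.

```python
import re


def normalize_phone(raw: str) -> str:
    """Strip spaces, dashes, and brackets, keeping a leading + country prefix."""
    text = raw.strip()
    prefix = "+" if text.startswith("+") else ""
    digits = re.sub(r"\D", "", text)  # drop everything that is not a digit
    if not digits:
        raise ValueError(f"no digits in phone number: {raw!r}")
    return prefix + digits
```

A form could run this on submit and store the result, instead of rejecting input that contains spaces or dashes.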
kerkeslager•7mo ago
I think fundamentally, no parsing/normalizing library can be effective for addresses. A much better approach is to have a search library which finds the address you're looking for within a dataset of all the addresses in the world.

Addresses are fundamentally unstructured data. You can't validate them structurally. It's trivial to create nonexistent addresses which any parsing library will parse just fine. On the flipside, there's enough variety in real addresses that your parser has to be extremely tolerant in what it accepts--so tolerant that it basically tolerates everything. The entire purpose of a parser for addresses is to reject invalid addresses, so if your parser tolerates everything it's pointless.

The only validation that makes any sense is "does this address exist in the real world?". And the way to do that is not parsing, it's by comparing to a dataset of all the addresses in the world.

I haven't evaluated this project enough to understand confidently what they're doing, but I hope they're approaching this as a search engine for address datasets, and not as a parsing/normalizing library.

vidarh•7mo ago
And keeping such datasets up to date is another matter entirely, because clearly a lot of companies rely on datasets that were outdated before their company even existed.

A trivially simple example of just how messy this is when people try to constrain it is that it's nearly random whether or not a given carrier would insist on me giving an incorrect address for my previous place, seemingly because traditionally and prior to 1965 the address was in Surrey, England.

The "postcode area name" for my old house is Croydon, and Croydon has legally been in London since 1965, and was allocated its own postcode area in 1966. "Surrey" hasn't been correct for addresses in Croydon since then.

But at least one delivery company insisted my old address was invalid unless I changed the town/postcode area to "Surrey", and refused to even attempt a delivery. Never mind they had my house number and postcode, which was sufficient to uniquely identify my house.

kerkeslager•7mo ago
Agreed. Keeping an up-to-date dataset of addresses is enormously hard. It's impossible to do perfectly, and only a few companies are capable of doing it passably, while the rest of us have no choice but to buy from them.

But notably, to validate a parser/normalizer, you need this dataset anyway, so creating a parser/normalizer isn't even saving you that work. It's just giving you a worse result for more work.

derdi•7mo ago
> real world [...] dataset

You are equating two things that are not equatable.

kerkeslager•7mo ago
I don't think I am. And despite your mangling what I said beyond recognition, even your quotation of me doesn't make it look like I am.
derdi•7mo ago
I know that you are equating things that are not equatable, since I have personally been affected by businesses relying on "datasets" to claim that my real world address, which definitely existed, did not exist. Data is not the same as reality.
kerkeslager•6mo ago
> I know that you are equating things that are not equatable, since I have personally been affected by businesses relying on "datasets" to claim that my real world address, which definitely existed, did not exist.

It sounds like people at those businesses equated a dataset to the real world, not me. You're an adult, direct your frustrations appropriately.

> Data is not the same as reality.

That glosses over a lot of nuance.

Obviously, no dataset perfectly represents reality. But, this fact is often used to dismiss data entirely, resulting in people making decisions with absolutely no evidence whatsoever.

An appropriate use of an address database might be: when the user enters an address not in the database, do a fuzzy search and suggest the best match you can find, asking "Did you mean X?" At that point, if the user says, "No, I really meant what I put in," then you accept the data they gave you. This catches most mistakes while allowing users to put in addresses that aren't in your dataset.
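A rough sketch of that flow using Python's standard `difflib`. The function name, the status strings, and the 0.8 cutoff are assumptions for illustration, and `known` stands in for whatever address dataset the site has.

```python
import difflib
from typing import Iterable, Optional, Tuple


def _key(address: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(address.lower().split())


def check_address(
    entered: str, known: Iterable[str], cutoff: float = 0.8
) -> Tuple[str, Optional[str]]:
    """Return ("known", match), ("suggest", best_guess), or ("unknown", None)."""
    by_key = {_key(address): address for address in known}
    key = _key(entered)
    if key in by_key:
        return ("known", by_key[key])
    close = difflib.get_close_matches(key, list(by_key), n=1, cutoff=cutoff)
    if close:
        return ("suggest", by_key[close[0]])
    return ("unknown", None)
```

On `"suggest"` the site asks "Did you mean X?"; if the user declines, or the result is `"unknown"`, it keeps the address exactly as entered. The dataset only ever prompts, it never rejects.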

derdi•6mo ago
That last suggestion makes a lot of sense. It makes a lot of sense specifically because it is the opposite of what you suggested above:

> The entire purpose of a parser for addresses is to reject invalid addresses, so if your parser tolerates everything it's pointless.

kerkeslager•6mo ago
The sentence you quoted contains no suggestion for how a site should behave.

It's bizarre to me that you're telling me I said things I didn't say, and then quoting things that don't say what you're claiming they say.

derdi•6mo ago
If the parser's rejection of an address doesn't influence the site's behavior, the site might as well not use the parser.
kerkeslager•6mo ago
Yes. Correct.

I'm saying that they should not use the parser, because the only ways it can influence the site's behavior are too buggy to be useful.

shakna•7mo ago
I somehow doubt this will pass the snifftest of one of my old addresses, which Australia Post successfully delivered to on a weekly basis:

    Third on right of main,
    Tiwi College,
    Melville Island, 0822, AU.
You can try to normalize that... But "Main Road" is in another city. Because I wasn't living in a city. There were no road names. And the 3rd position was an empty plot, not the third house. We had a bunch of houses around a strip of land, a few minutes from the airstrip - the only egress.
mrweasel•7mo ago
You also have to account for interestingly worded addresses. We had:

  Streetname 5, behind the glazier business.
  It might say <some other name> on the door
That's very specific, but also not really an address.
devilbunny•7mo ago
“Duzbuns Hopsit pfarmerrsc”

(For today’s 10000, that’s Terry Pratchett. The autocrat of the city of Ankh-Morpork amuses himself, at times, by figuring out where unreadably-addressed mail should go - in this case, a baker (“duzbuns” == does buns) across the street (“hopsit” == opposite) from a pharmacy, which in his extremely detailed knowledge of the city means only one place.)

ryao•7mo ago
I recall an episode of Frasier where Niles moved into “The Montana” and it was so famous that he could just have people write his name followed by “The Montana” on envelopes to send mail to him. I believe that was based on the Dakota apartments in NYC. I have no idea if people at the actual Dakota apartments can do that, but I suspect the post offices in NYC would know to send mail there if it simply said a name followed by “The Dakota”.
usr1106•7mo ago
Something like that has not worked in Finland for several years. All addresses are scanned and matched by the mail with a DB of "valid addresses". There is a big student dorm in this city here, which has had problems with mail delivery for years. Not that students would receive a lot of mail. Most businesses charge extra for paper bills, most authorities prefer electronic messages and private postcards don't seem to be common in that age group either.

After years of undeliverable mail it was found that the building permit for the dorm was registered incorrectly by the city and as a result the rooms were never registered as residential addresses in the postal DB.

gorgoiler•7mo ago
I have a real soft spot for these codifications of everyday things. A lot of us do. See also tzdata, GNU units, pluralize(noun), humanize(timestamp), and SPICE astronavigation. And yes, locating Mars in the night sky is indeed an everyday thing!

What are some others?

ttw44•7mo ago
When I was first getting into web development a year ago, I was making forms that took addresses. Coming from a C and C++ background, I kept asking, what if they lived in a specific country? How can I make my database truly safe? What is the best way to store all these addresses? I immediately gave up on that effort. Very impressive.
claytongulick•7mo ago
Libpostal is great and was a lifesaver for me, but anyone who is interested in using it should be aware that it is NOT lightweight.

IIRC it takes gigs of storage space and has significant runtime requirements.

Also, while it's implemented in C, there are language bindings for most major languages [1].

It's one of those things where it's most likely best deployed as an independent service on a dedicated machine.

[1] https://github.com/openvenues/libpostal?tab=readme-ov-file#b...

alganet•7mo ago
Having used it in the past, I can firmly say it performs better than regex.