
Compressing Icelandic name declension patterns into a 3.27 kB trie

https://alexharri.com/blog/icelandic-name-declension-trie
240•alexharri•6mo ago

Comments

jedimastert•6mo ago
It's like an interview question from hell. Reversing a trie is one of those things that I might only ever use once in my life, but that one time I will look like an absolute wizard.
adammarples•6mo ago
I don't think they reversed the trie, they just reversed the names before putting them in?
jedimastert•6mo ago
That's what I meant yes.
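
For readers who want to see the "reverse the names before putting them in" idea concretely, here is a minimal sketch. It is not the article's code: the node layout and the function names `buildSuffixTrie`, `lookup`, and `sharedSuffixLength` are made up for illustration. The two names and their patterns are the ones quoted further down the thread.

```javascript
// Sketch: a trie keyed on reversed names, so names with a common ending
// share a path from the root. Node layout and function names are hypothetical.
function buildSuffixTrie(entries) {
  const root = { children: new Map(), value: null };
  for (const [name, pattern] of entries) {
    let node = root;
    // Walk the name from its last character to its first.
    for (const ch of [...name].reverse()) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), value: null });
      }
      node = node.children.get(ch);
    }
    node.value = pattern;
  }
  return root;
}

// Exact lookup: follow the reversed name and return the stored pattern, or null.
function lookup(root, name) {
  let node = root;
  for (const ch of [...name].reverse()) {
    node = node.children.get(ch);
    if (!node) return null;
  }
  return node.value;
}

// Number of trailing characters two names have in common,
// i.e. how long their shared path in the trie is.
function sharedSuffixLength(a, b) {
  const ra = [...a].reverse();
  const rb = [...b].reverse();
  let n = 0;
  while (n < ra.length && n < rb.length && ra[n] === rb[n]) n++;
  return n;
}

const trie = buildSuffixTrie([
  ["Ástvaldur", "ur,,i,ar"],
  ["Baldur", "ur,ur,ri,urs"],
]);

console.log(lookup(trie, "Baldur"));                  // "ur,ur,ri,urs"
console.log(sharedSuffixLength("Ástvaldur", "Baldur")); // 5 ("aldur")
```

The two names share the path for "aldur" and only diverge at the sixth character from the end, which is why the trie needs to look that deep to tell them apart.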
treetalker•6mo ago
I remember that when I was first learning Spanish in high school, I found a piece of (Windows) software that pelted you with a series of pairs of an infinitive and a tense, and you had to conjugate the infinitive accordingly. (Spanish conjugation typically changes the end of the word; irregular verbs tend to involve stem changes). It was fantastic practice and really ingrained the rules; I became a whiz at it.

When I started learning Russian, the declensions (like the ones mentioned in the article) really threw me for a loop. I looked all over for a similar app to explain the patterns and drill rote practice, but never found one.

While slightly off-topic, does anyone know of such an app (web-based or macOS/iOS)?

leobg•6mo ago
https://memrussian.com/?
netsharc•6mo ago
Grandparent talks about classic Windows software. On the Play Store this app says "Contains ads - In-app purchases".

Ah, as a cheap bastard, I hate how software was pay once back then, and for this one I'm just going to ask you what's the monthly subscription price?

mpascale00•6mo ago
This comes up in so many threads here... How can we change the culture of subscriptions back to pay once???
necovek•6mo ago
It's not really about the culture anymore. Software that requires maintenance — and most does — has a continuous development cost. As such, subscription is the most natural way to cover it.

On the other hand, we have software which has low maintenance cost, but sold for peanuts ($0-$10) in small quantities, so authors try to introduce alternative revenue streams.

As in, it's fair to pay continuously (subscription) for continuous work (maintenance), so I don't expect that to go away. Ads, though, yuck...

sneak•6mo ago
Software sold today does not require maintenance. Software to work in the future requires maintenance. I am not buying future software. I am buying today software.

Increasingly I am not buying software at all.

perching_aix•6mo ago
This is a good argument in favor of subscriptions not being mandatory, but not in favor of the abolishment of subscriptions overall, which is what they were talking about.
3036e4•6mo ago
That is the old way. You bought some application and it came with upgrades until next major version release or similar. Then when that release came out you could decide to pay again or just keep using the old (now unsupported) version you already paid for.

That solved all the issues with paying for maintenance, but sadly someone must have figured out a mandatory subscription was a better way to make more money.

johannes1234321•6mo ago
It's not only a way to make more money, but it also matches better to modern development approaches.

Major versions come from a time where one had to produce physical media. Thus one could do a major release only every few years. Back then features had to be grouped together in a big bang release.

Nowadays one can ship features as they are being developed, with many small feature changes all the time.

3036e4•6mo ago
That was probably true a long time ago, but I bought software using that model that did not have any physical releases and at least one had frequent minor releases adding new features.

It seems to me like the "subscription model" is exactly the same, except for the use of DRM and cloud dependencies to force users to pay for new versions. The only thing that changed was that the option to remain on an old version was taken away from users.

sgarland•6mo ago
On the contrary, software today is so absurdly buggy that it often does require maintenance to work.
charcircuit•6mo ago
Even ignoring security, bug fixes, new features, etc it is also not fair that you can get value from the app every month, but the developer doesn't get to capture a reward for any of this value. Having people pay monthly for value they get monthly seems reasonable.
BenjiWiebe•6mo ago
Does that mean you'd be in favor of subscriptions for owning a vehicle, rather than paying outright? Or a house?

The manufacturer/builder gets paid once, and you get value monthly.

charcircuit•6mo ago
Leasing cars and renting houses is already a common practice. So yes I believe these make sense to exist.

The existence of purchasing cars and houses with no ongoing cost to the builder is due to competition.

BenjiWiebe•6mo ago
Also leasing cars isn't usually (ever?) from the manufacturer of the car.

Houses, not sure how it's done in more populous areas, but around here you don't ever rent from the builder. You rent from someone who bought the house from a builder (or bought from someone who did, etc etc).

mpascale00•6mo ago
I disagree. You can read a book or listen to a record, watch a dvd, unlimited times, having fairly paid upfront a price for the item. A computer is general purpose and lets you check your email every day, hell even lets you create new value in the form of new software, without the manufacturer receiving a royalty.

The idea of capturing reward post-receipt is feudalistic.

charcircuit•6mo ago
The existence of products in competitive markets is not a counter example to what my point was. I recommend looking at the terms bottom up pricing and top down pricing. The former is about creating a price based off of how much it costs to do business and then adding a profit margin. The latter is creating price in line with how much value it offers customers. The existence of products using bottom up pricing doesn't mean top down pricing does not exist.
necovek•6mo ago
That's not how markets work (and I disagree that it would be reasonable).

Price is usually established based on how much something costs to make (materials, effort, profit), combined with market conditions (abundance/shortage of products, surplus cash/tough economy...).

If you want to continuously extract profit from consistent use of a hammer or vacuum cleaner, somebody else will trivially make a competing product at a lower price with no subscription.

charcircuit•6mo ago
>somebody else will trivially make a competing product at a lower price with no subscription.

And software like Photoshop is not trivial to copy, so it can survive being priced based on the value it provides. There exist competitors that don't have a subscription, but they are not good enough to kill it.

sgarland•6mo ago
Given how profitable it is, I doubt it’ll be changed.

That said, I very much like Codeweavers’ approach [0], which IMO is the modern equivalent to purchasing software on a physical medium: you buy it, you can re-download it as many times as you’d like, install it on as many machines as you’d like (single-user usage only), and you get 1 year of updates and support. After that, you can still keep using it indefinitely, but you don’t get updates or paid support. You get a discount if you renew before expiry. They also have a lifetime option which, so far, they’ve not indicated they’re going to change.

I have no affiliation with them, I just think it’s a good product, and a good licensing / sales model.

[0]: https://www.codeweavers.com/store

mpascale00•6mo ago
Profitable for sure, but I'm often half surprised by the lack of competition against subscription-based everything these days.
nsksl•6mo ago
Find a pirate version if possible…
GuB-42•6mo ago
I don't know about this app but many of the "Contains ads - In-app purchases" apps offer to remove the ads for a one-time payment.
yorwba•6mo ago
You might be able to build something similar yourself using declension data extracted from Wiktionary using wiktextract: https://github.com/tatuylonen/wiktextract#pre-extracted-data
jeffwass•6mo ago
When I was learning Spanish (on my own) 25 years ago I had a Spanish/English dictionary. It only translated verbs to Spanish infinitive, but each had a numerical index mapping it to a class of verbs with the same conjugation pattern.

There was a section at the front of the dictionary with full conjugation patterns over all tenses for one sample verb in each class.

Eg, each type of stem-changing verb fell into one index, full irregulars were singletons in their own class, some irregulars that behave similarly (iirc tener and detener) shared one class.

So all verbs in Spanish fell neatly into a few dozen unique patterns, and the indexing was already done.

I was going to build a quiz software just like you mentioned to conjugate any verb in any tense, but “never got around to it”.

I wonder how the reverse-string trie pattern in the article would be for reconstructing the class mapping.

kashunstva•6mo ago
> … learning Russian… explain the patterns… such an app

Non-native Russian speaker here. In the past, I cobbled together some scripts that use the spaCy Python module with the larger of the two Russian modules to provide context-aware lemmatization and grammatical tag extraction.

On the whole, though, my biggest gains in Russian were in letting go of the need to analytically deconstruct the inflections and instead build up a mental library of patterns (and exceptions) in my head through use.

EDIT: I mean context within a sentence, not a broader meaning.

Rendello•6mo ago
There's some Anki (flashcard) decks that use the "KOFI" method:

> KOFI (Konjugation First) is the name I've given to a provocative language-learning approach I've created: to learn all the forms of a language's conjugation before even starting to formally study the language

I used the French one, years after I learned French, because my conjugation was abysmal. You can get by using basic tenses or wrong tenses, and people will understand you, but it's not what you want. The KOFI method is supposed to teach you all the conjugation patterns in a matter of months before learning the language. I'd like to give it a try in earnest some day for a new language. My interest in French has waned, so I didn't stick with it.

https://ankiweb.net/shared/info/1131659186

gametorch•6mo ago
I used Clozemaster effectively to learn Russian. It's not exactly what you describe, but you can fly through many "clozes" to ingrain the patterns into your brain.
jdcarr•6mo ago
I use ConjuGato on iOS for practicing Spanish conjugations. There’s a game mode where you’re given an infinitive/tense/person and think of the conjugation and you can filter it down to solely irregular verbs to learn the exceptions
LoneGeek•6mo ago
If you can read Russian, there is a Python app for morphological analysis called pymorphy3. Documentation: https://pymorphy2.readthedocs.io/en/stable/.

It is based on an OpenCorpora dictionary: https://opencorpora.org/dict.php

This dictionary is based on a Zaliznyak dictionary, which is always referenced in Wiktionary's articles.

wchar_t•6mo ago
Do you happen to still have a link to this software?
1-more•6mo ago
I had an idea for a flash card generator for Russian that would do preposition + adjective + noun to get faster at declining in my head; I had done Latin before that and no one expects you to do Latin declension quickly (unless you're a monk maybe?). Never went anywhere with it, naturally.
lifthrasiir•6mo ago
A possible alternative, especially for beygla/strict, would be perfect hashing.
Scaevolus•6mo ago
You can compress even better than standard perfect hashing because not all values are unique, so collisions might allow you to store multiple name -> suffix combos in the same bucket.

Of course, that would mean you lose the ability to say "name not handled".

kmmbvnr_•6mo ago
Doesn't that look like an interesting approach for highly optimized embeddings?
robin_reala•6mo ago
No idea if Rails copes with this automatically, but it feels like the sort of magic it’s historically been really good at. I remember reading the source code for `pluralise` and finding that someone had encoded the pluralisation rules including irregular cases for Welsh.
Alifatisk•6mo ago
Love Rails, there is a method for everything
dmurray•6mo ago
For the 800 names that were missing declension data in the database, it seems like the most straightforward thing to do would be to assign their declensions by hand. It shouldn't take a native speaker more than a couple of hours (if some name they haven't seen before is ambiguous, then whatever they guess at least won't sound obviously wrong to other native speakers). Alternatively, very cheap to ask an LLM to do it.

Encoding them into a trie like this would still be a good way to distribute the result, but you don't have to rely on the trie also being a good way to guess the declensions.

perching_aix•6mo ago
Yeah, that'd be a good idea. That said, it still wouldn't resolve the issue for names that are in-use despite not being approved (or foreign names).

I also live in a country with a centrally governed personal name list, but you can request exceptions, and there are people who were born before the list existed, so their names won't necessarily be on the list either. Immigrants can also retain their names during naturalization I believe, and there can be lots of other complications still. So the ability to sorta-kinda predict the proper declension is still useful.

thaumasiotes•6mo ago
Related: https://en.wikipedia.org/wiki/Naming_laws_in_China#Ma_Cheng
wizzwizz4•6mo ago
I see no reason that an LLM should be better at guessing than a trie (unless the actual example was in its training data, in which case a web search would be more appropriate).
dmurray•6mo ago
I agree. I just like having the guessing done at compile time on principle. It allows you to change a guess, if you find that it's wrong, and convince yourself that you haven't broken any of the other cases where you were previously accidentally right.
wizzwizz4•6mo ago
My main objection is the temptation to mix real and fabricated data. Your entire dataset becomes much less useful if it's got nonsense mixed in with it, and if historical examples are anything to go by, it can be hundreds of years before someone identifies and untangles the nonsense from the fact. Any minor benefit is not worth this risk imo.
esafak•6mo ago
I wonder if existing LLMs already know these patterns?
jer0me•6mo ago
The Icelandic government has been proactive about helping OpenAI train its models on the language to stave off extinction: https://openai.com/index/government-of-iceland/
xigoi•6mo ago
If they’d rather support open-source models so the future of the language is not in the hands of a single foreign corporation…
thaumasiotes•6mo ago
Yes, this is an example of a problem that an LLM is ideally suited to solve.
alexharri•6mo ago
It would be good to cover more names for sure -- that's an ongoing process at DIM. Names are frequently added to the approved list of Icelandic names, so there's always going to be some lag.

I would not be confident enough to add the data myself since I'd probably be wrong a lot of the time. When reviewing the results for the top 100 unknown names I frequently got results that I thought _might_ be wrong, but I wasn't sure. For those, I looked up similar names in DIM to verify, and often thought "huh, I would not have declined those names like this". For that reason, I rely on the DIM data as the source of truth since it's maintained by experts on the language.

alucardo•6mo ago
Hmm, is this lib GDPR compliant?
detaro•6mo ago
Why wouldn't it be?
bot403•6mo ago
If this isn't compliant then neither are name day calendars or baby name websites.

It's not a privacy issue if it's just "someone's" name.

kiicia•6mo ago
GDPR is about accountability for handling identifiers like full name of actual person. Using parts of names, where each part does not identify any particular person, in generalized list like described here does not fall under GDPR.
shagie•6mo ago
There are a relatively finite number of Icelandic names. https://en.wikipedia.org/wiki/Icelandic_Naming_Committee

> A name not already on the official list of approved names must be submitted to the naming committee for approval. A new name is considered for its compatibility with Icelandic tradition and for the likelihood that it might cause the bearer embarrassment. Under Article 5 of the Personal Names Act, names must be compatible with Icelandic grammar (in which all nouns, including proper names, have grammatical gender and change their forms in an orderly fashion according to the language's case system).

A database of those names is no more interesting or personal than a dictionary or list of names ( https://www.insee.fr/en/statistiques/6536067 ) in another language... which is where they got the data.

> Iceland has a publicly run institution, Árnastofnun, that manages the Database of Icelandic Morphology (DIM). The database was created, amongst other reasons, to support Icelandic language technology.

https://bin.arnastofnun.is/DMII/aboutDMII/

There is no more personal information being presented than saying John or providing https://en.wikipedia.org/wiki/John_(given_name) or https://www.wolframalpha.com/input?i=John

John may be your given name, but that data isn't personal data. One of the numbers 1969, 1978, 1987, 1996 might be your birth year... but https://oeis.org/A101039 isn't personal information either. Combining John with Smith and 1978 as the year of someone's birth... now you've got personal information that would be covered by the GDPR.

ralferoo•6mo ago
That's not quite what qualifies it as PII.

> John may be your given name, but that data isn't personal data. One of the numbers 1969, 1978, 1987, 1996 might be your birth year... but https://oeis.org/A101039 isn't personal information either. Combining John with Smith and 1978 as the year of someone's birth... now you've got personal information that would be covered by the GDPR.

Just the facts "John" or "Smith" or "1978" aren't PII, but any single one attached to some other data is, because then that provides partial identification of that other data. So, for instance an attribution of a forum post to "John" is PII, even if there are thousands of other Johns using the system.

Actually, even that's not necessarily true. The mere fact that you are acknowledging a user exists with that name may make it PII. It's not a big deal to say our usernames include "John", "Mark", etc if there are literally thousands of them, but it's a big deal if one of the usernames is an incredibly rare name or spelling. In this case, the list presented in the article isn't PII, because the list is just a list of names downloaded from a government site that represent possible acceptable names. Just having that list provides no information about whether anyone with any of those names is using your service.

radpanda•6mo ago
> There are, in fact, 88 approved Icelandic names with this exact pattern of declension, and they all end with “dur”, “tur” or “ður”.

…

> But that quickly breaks down. There are other names ending with “ður” or “dur” that follow a different pattern of declension

My “everything should be completely orderly” comp-sci brain is always triggered by these almost trivial problems that end up being much more interesting.

Is the suffix pattern based on the pronunciation of the syllable(s) before the suffix? If one wanted to improve upon your work for unknown names, rather than consider the letters used, would you have to do some NLP on the name to get a representation of the pronunciation and look that up (in a trie or otherwise)?

dmit•6mo ago
> Is the suffix pattern based on the pronunciation of the syllable(s) before the suffix?

Careful, this is how you fall down the Are Dependent Types The Answer?? hole.

perching_aix•6mo ago
Not sure what that's supposed to mean, but if Icelandic is anything like my native language in this, then it is indeed a pronunciation based thing. Which should make sense, since languages are (historically) spoken first, written second.
dmit•6mo ago
Heheh, it was mostly a reference to my [and mostly others'!] experiments with encoding human languages in a programming language. There are some pretty neat ideas there to explore, like the difference between Subject-Object-Verb (SOV) and Object-Subject-Verb. Or postfix languages (e.g. Forth) mapping to some human languages.

In this particular example, having a subsequent part of an expression rely on prior parts would usually be accomplished at runtime in most languages. But some (like Idris) might allow you to encode the rules in the type system. Thus the rabbit hole.

perching_aix•6mo ago
Ah okay. That's a journey I'm currently also preparing to embark on, though from the other direction: I'm trying to generate "natural" language from program code. I already know it's pretty hopeless, but increasingly I feel like it's not really a choice anyhow, so I may as well finally have a go at it. Let's see :)
dmit•6mo ago
Godspeed!
alexharri•6mo ago
Hmm, good idea. There are names that have the exact same pronunciation yet have different patterns of declension, for example:

- Ástvaldur -> ur,,i,ar
- Baldur -> ur,ur,ri,urs

The "aldur" ending is pronounced in the exact same manner, but applying the declension pattern of "Ástvaldur" to "Baldur" would yield:

- Baldur
- Bald
- Baldi
- Baldar

The three last forms feel very wrong (I asked my partner to verify and she cringed).

Spoken Icelandic is surprisingly close to its written form. I wouldn't expect very different results for the trie if a "phonetic" version of names and their endings were used instead of their written forms.
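
To make the pattern notation above concrete, here is a small sketch of how a pattern such as "ur,,i,ar" could be applied to a name. The encoding is an assumption inferred from the Baldur example: four comma-separated endings in the order nominative, accusative, dative, genitive, where the first ending is stripped to get the stem. The function name `applyPattern` is hypothetical and not taken from the library.

```javascript
// Assumed encoding (inferred from the example in this comment): a pattern
// lists four comma-separated endings. The first (nominative) ending is
// stripped from the name to get the stem, then each ending is appended.
function applyPattern(name, pattern) {
  const endings = pattern.split(",");
  const nominativeEnding = endings[0];
  if (!name.endsWith(nominativeEnding)) {
    throw new Error(`"${name}" does not end with "${nominativeEnding}"`);
  }
  const stem = name.slice(0, name.length - nominativeEnding.length);
  return endings.map((ending) => stem + ending);
}

// Baldur with its own pattern:
console.log(applyPattern("Baldur", "ur,ur,ri,urs"));
// [ 'Baldur', 'Baldur', 'Baldri', 'Baldurs' ]

// Baldur with the pattern of Ástvaldur, reproducing the wrong forms above:
console.log(applyPattern("Baldur", "ur,,i,ar"));
// [ 'Baldur', 'Bald', 'Baldi', 'Baldar' ]
```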

sneak•6mo ago
This seems complicated.

Why not just reuse the existing standard and change everyone’s last names to Kim, Lee, or Park?

dmit•6mo ago
> everyone’s last names

*surnames. Not last in that case, whatever the case is you're trying to make.

yujzgzc•6mo ago
Valiant effort at old-school engineering applied to a niche problem. (Iceland has a population of only around 400,000 people!) As much as I love the geekery of this stuff though, isn't it already a better ROI to get an LLM to generate the strings you need? It has its own other problems (not claiming it'll be perfect) but for something so language related, it makes a lot of sense. Would also work for other languages that have the same problem with declension of proper nouns like Russian or Finnish.
tomsmeding•6mo ago
The article describes that a government body is using this library to generate indictments. In that situation, you do not want something that is mostly usually correct. Indeed, they asked the author for a strict version that does not try to guess the declension of unknown names based on their suffix, presumably so that they can just not decline them, which is better than picking the wrong declension 0.05% of the time.
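
The difference between the strict and the guessing behaviour can be sketched roughly as follows. This is only an illustration, not the library's implementation: it uses a plain `Map` and a longest-common-ending scan as a stand-in for the trie, and the names `known`, `commonEndingLength`, `findPattern`, and the test name "Xvaldur" are made up.

```javascript
// Known names and patterns, taken from the examples elsewhere in the thread.
const known = new Map([
  ["Ástvaldur", "ur,,i,ar"],
  ["Baldur", "ur,ur,ri,urs"],
]);

// Number of trailing characters two strings have in common.
function commonEndingLength(a, b) {
  let n = 0;
  while (
    n < a.length &&
    n < b.length &&
    a[a.length - 1 - n] === b[b.length - 1 - n]
  ) {
    n++;
  }
  return n;
}

// strict: only exact matches, otherwise null (the caller leaves the name as-is).
// non-strict: fall back to the known name with the longest shared ending.
function findPattern(name, { strict }) {
  if (known.has(name)) return known.get(name);
  if (strict) return null; // refuse to guess
  let best = null;
  let bestLength = 0;
  for (const [knownName, pattern] of known) {
    const n = commonEndingLength(name, knownName);
    if (n > bestLength) {
      best = pattern;
      bestLength = n;
    }
  }
  return best;
}

console.log(findPattern("Xvaldur", { strict: true }));  // null
console.log(findPattern("Xvaldur", { strict: false })); // "ur,,i,ar"
```

In strict mode a caller would typically fall back to the undeclined name when `null` comes back, which matches the "just not decline them" behaviour described above.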
yujzgzc•6mo ago
By the author's own evaluation the solution proposed is not always correct either. The real danger would be the LLM hallucinating a different name altogether.
silvestrov•6mo ago
One more optimization idea: instead of having the trie map to the suffix string directly, make an array of unique suffixes and let the trie map to an index into that array, e.g.

    const suffixes = [",,,", "a,u,u,u", ",,i,s", ",,,s", "i,a,a,a", ...];
and then use the index of this list in the

    var serializedInput = "{e:{n:{ein:0_r: ...
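
A rough sketch of the interning step, operating on a flat name-to-pattern object rather than the serialized trie, might look like this. The function name `internPatterns` and the entry `placeholderA` are hypothetical and only there to show a repeated pattern.

```javascript
// Sketch: store each distinct pattern once and keep only its index per name.
function internPatterns(nameToPattern) {
  const suffixes = [];       // unique patterns, in first-seen order
  const indexOf = new Map(); // pattern -> index into `suffixes`
  const nameToIndex = {};
  for (const [name, pattern] of Object.entries(nameToPattern)) {
    if (!indexOf.has(pattern)) {
      indexOf.set(pattern, suffixes.length);
      suffixes.push(pattern);
    }
    nameToIndex[name] = indexOf.get(pattern);
  }
  return { suffixes, nameToIndex };
}

const { suffixes, nameToIndex } = internPatterns({
  "Ástvaldur": "ur,,i,ar",
  "Baldur": "ur,ur,ri,urs",
  "placeholderA": "ur,,i,ar", // hypothetical entry that reuses a pattern
});

console.log(suffixes);    // [ 'ur,,i,ar', 'ur,ur,ri,urs' ]
console.log(nameToIndex); // { 'Ástvaldur': 0, Baldur: 1, placeholderA: 0 }
```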
KTibow•6mo ago
I (Claude Code) tried this and it actually increased the gzipped size by 100 bytes (3456 -> 3556), only reducing the non-compressed size by 20%, likely because gzip is really good at interning repeated patterns already.
contravariant•6mo ago
You could go a step further by putting the suffixes themselves into the trie and then identifying identical subtrees.

If you can use gzip there's bound to be a clever way of using a suffix array as well, that might end up being better unless you can use an optimised binary format for the tree.
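
Identifying identical subtrees can be sketched as a bottom-up pass that gives every structurally distinct node one shared instance, turning the trie into a DAG. This is only an illustration under assumed conventions: the node shape `{ value, children }`, the function name `mergeIdenticalSubtrees`, and the tiny example trie are all hypothetical.

```javascript
// Sketch: share structurally identical subtrees so the trie becomes a DAG.
// Assumed node shape: { value: string | null, children: { [char]: node } }.
function mergeIdenticalSubtrees(root) {
  const canonical = new Map(); // signature -> the one shared node
  const ids = new Map();       // shared node -> small integer id
  let nextId = 0;

  function visit(node) {
    const children = {};
    const parts = [];
    // Children are visited first, so each child is already canonical
    // and can be identified by its id.
    for (const ch of Object.keys(node.children).sort()) {
      const child = visit(node.children[ch]);
      children[ch] = child;
      parts.push(ch + ":" + ids.get(child));
    }
    const signature = JSON.stringify([node.value, parts]);
    if (!canonical.has(signature)) {
      const shared = { value: node.value, children };
      canonical.set(signature, shared);
      ids.set(shared, nextId++);
    }
    return canonical.get(signature);
  }

  return { root: visit(root), uniqueNodes: canonical.size };
}

// Illustrative trie over reversed endings: "...dur" and "...tur" both lead
// to a leaf carrying the same pattern.
const leaf = () => ({ value: "ur,,i,ar", children: {} });
const exampleTrie = {
  value: null,
  children: {
    r: { value: null, children: { u: { value: null, children: { d: leaf(), t: leaf() } } } },
  },
};

const merged = mergeIdenticalSubtrees(exampleTrie);
console.log(merged.uniqueNodes); // 4 (the original tree has 5 nodes)
```

Whether this beats gzip on the serialized form is a separate question, since the serialization then needs some way to reference shared nodes.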

ryanjshaw•6mo ago
An interesting article but I was surprised there was no discussion about what humans do to address this problem?
Zanfa•6mo ago
They stick with the nominative case. That’s the only safe way not to butcher somebody’s name in a language like Estonian that has 14 cases. It’s infinitely easier to update copy to use only nominative than try to apply the cases automatically.
alexharri•6mo ago
As a native Icelandic speaker, I have an intuition for how to decline names -- I don't really think about it consciously. I'd assume that for most people it's just pattern matching.

Native speakers very frequently decline names in ways that are not technically perfect but sound correct enough. For example, my name (Alex) should not be declined, but people frequently use the declension pattern (Alex, Alex, Alexi, Alexar).

There's some parallel to be drawn with how the compressed trie applies patterns that it's learned to names. That's at least how I thought about it when designing the library.

mikepurvis•6mo ago
I’m surprised there’d be a benefit to doing this in the JS vs having your database just return all the cases with the name and then you select which one you need at display time — basically in the same layer that’s populating your localized language templates.

That said I’m curious how this manifests with cross-language situations. I guess the Icelandic UI displaying French names would just always use the nominative case, and likewise for the English UI displaying Icelandic names? I assume this all mostly matters where the user is directly being addressed, or perhaps in an admin panel (“user x responded to user y”).

tempodox•6mo ago
Is Icelandic name declension deterministic enough that this method reliably works? That would be a lucky break. Language is typically quite messy.
nkrisc•6mo ago
It probably helps that Iceland has a relatively small population and the language is actively managed by the government.
ralferoo•6mo ago
I mean, it's an interesting problem for Icelandic sites, but because he's explaining the basic concepts of how declensions work, it seems like he's aiming this at non-Icelandic developers. If they were to use this, no doubt it'll end up butchering names in some other language and lead to all manner of hard to track down bugs.

For example, if an English person called Arthur uses the site in Icelandic, I'm not sure they'd expect their name to be changed to presumably "Arth", "Arthi" or "Arthar" even if they were a keen learner of Icelandic. Their name is their name. So, as well as storing someone's name, you also have to ask them what language their name is, or guess and get it wrong. At that point, you might as well just ask them for all the different forms for the name as well, and then you don't have to worry about whether their name is on an approved list or not.

And if the website isn't localised into Icelandic, I've also got to wonder if Icelandic visitors would have an expectation of Icelandic grammar rules being applied to English (or whatever) text. Most Icelandic people I've spoken to before have an excellent command of English anyway, and I'm sure they'd understand why their name isn't changing form in English.

pelorat•6mo ago
Not sure how it is nowadays, but Iceland used to force anyone immigrating to officially change or "icelandify" their names.

So if your name was Arthur, and you wanted to emigrate to Iceland, you would have to change your name.

Might still be like this.

SonOfLilit•6mo ago
My brain is screaming that there has to be a solution in <1kb uncompressed (for the non-strict version).

Maybe generating a minimal list of regexes that classifies 100% of names correctly? Maybe a big enough bloom filter? Maybe like a bloom filter but instead of hashes we use engineered features?