frontpage.

Start all of your commands with a comma

https://rhodesmill.org/brandon/2009/commands-with-comma/
194•theblazehen•2d ago•56 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
679•klaussilveira•14h ago•203 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
954•xnx•20h ago•552 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
125•matheusalmeida•2d ago•33 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
25•kaonwarb•3d ago•21 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
62•videotopia•4d ago•2 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
235•isitcontent•15h ago•25 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
40•jesperordrup•5h ago•17 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
227•dmpetrov•15h ago•121 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
332•vecti•17h ago•145 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
499•todsacerdoti•22h ago•243 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
384•ostacke•21h ago•96 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
360•aktau•21h ago•183 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
292•eljojo•17h ago•182 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
21•speckx•3d ago•10 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
413•lstoll•21h ago•279 comments

ga68, the GNU Algol 68 Compiler – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
6•matt_d•3d ago•1 comment

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
20•bikenaga•3d ago•10 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
66•kmm•5d ago•9 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
93•quibono•4d ago•22 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
260•i5heu•17h ago•202 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
33•romes•4d ago•3 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
38•gmays•10h ago•13 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1073•cdrnsf•1d ago•459 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
60•gfortaine•12h ago•26 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
291•surprisetalk•3d ago•43 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
150•vmatsiiako•19h ago•71 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
155•SerCe•10h ago•144 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
8•1vuio0pswjnm7•1h ago•0 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
187•limoce•3d ago•102 comments

UTF-8 history (2003)

https://doc.cat-v.org/bell_labs/utf-8_history
101•mikecarlton•4mo ago

Comments

theologic•4mo ago
Great story, and one that gets brought up on Hacker News on a regular cycle: https://news.ycombinator.com/item?id=26735958

While I love the Hacker News purity (it takes me back to Usenet), it makes me wonder if a little AI could take a repost and auto-insert the previous postings so people can see the earlier discussions.

flohofwoe•4mo ago
WinNT missing out on UTF-8 and instead going with UCS-2 for their UNICODE text encoding might have been "the other" billion-dollar mistake in the history of computing ;)

There was a 9-month window between the invention of UTF-8 and the first release of WinNT (Sep 1992 to Jul 1993).

But ok fine, UTF-8 didn't really become popular until the web became popular.

But then missing the other opportunity to make the transition with the release of the first consumer version of WinNT (WinXP) nearly a decade later is inexcusable.

rfl890•4mo ago
And nowadays developers have to deal with the "A/W" suffix bullshit.
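
A minimal sketch (not from the thread) of what that suffix split means in practice for a program that keeps its text in UTF-8: the "W" entry points expect UTF-16, so the caller converts first. MultiByteToWideChar, CP_UTF8, and MessageBoxW/MessageBoxA are real Win32 names; the surrounding program is only illustrative.

    #include <windows.h>

    int main(void)
    {
        /* "café" as UTF-8 bytes (0xC3 0xA9 encodes é). */
        const char *utf8 = "caf\xC3\xA9";
        wchar_t wide[32] = L"";

        /* First call returns the required length in wchar_t units
           (including the terminator, since cbMultiByte is -1). */
        int n = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
        if (n > 0 && n <= 32)
            MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, n);

        /* The "A" sibling, MessageBoxA, would reinterpret the same bytes
           in the process ANSI code page, which is the mismatch at issue. */
        MessageBoxW(NULL, wide, L"W (UTF-16) API", MB_OK);
        return 0;
    }

Every function that takes text has such an A/W pair, which is the bookkeeping being complained about.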
ivanjermakov•4mo ago
Windows using CP-125X encodings by default in many countries instead of UTF-8 did a lot of damage, at least in my experience.
andrewl-hn•4mo ago
For many European languages like French or German, the switch from local CP encodings meant that only some characters like å, ñ, ç, etc. would require extra bytes. And thus the switch to UTF-8 was a no-brainer.

On the other hand, Cyrillic and Greek are two examples of short alphabets that allowed combining them with ASCII into a single-byte encoding for countries like Greece, Bulgaria, Russia, etc. For those locations, switching to UTF-8 meant that you needed extra bytes for all characters in the local language, and thus higher storage, memory, and bandwidth requirements for all computing. So non-Unicode encodings stuck around there for a lot longer.

gblargg•4mo ago
And back then Unicode was just 16 bits so UTF-8 wasn't such an obvious advantage in flexibility.
anonymars•4mo ago
"UTF-8 was first officially presented at the USENIX conference in San Diego, from January 25 to 29, 1993" (https://en.wikipedia.org/wiki/UTF-8)

Hey team, we're working to release an ambitious new operating system in about 6 months, but I've decided we should burn the midnight oil to rip out and redo all of the text handling we worked on, replacing it with something that was just introduced at a conference...

Oh, and all the folks building their software against the beta for the last few months? Well, they knew what they were getting themselves into; after all, it is a beta (https://books.google.com/books?id=elEEAAAAMBAJ&pg=PA1#v=onep...)

As for Windows XP, so now we're going to add a third version of the A/W APIs?

More background: https://devblogs.microsoft.com/oldnewthing/20190830-00/?p=10...

nostrademons•4mo ago
Interestingly, there is another story on the HN front page about Steve Wozniak doing exactly that for the Apple I:

https://news.ycombinator.com/item?id=45265240

toast0•4mo ago
The 6502 and the 6800 are pretty similar. The 6501 was pin compatible with the 6800, but not software compatible; the 6501 was dropped as part of a settlement with Motorola.

Changing an in-progress system design to a similar chip that was much less expensive ($25 at the convention vs $175 for a 6800, dropped to $69 the month after the convention) is a leap of faith, but the difference in cost is obvious justification, and the Apple I had no legacy to work with.

It would have been great if Windows NT could have picked up UTF-8, but it's a bigger leap and the benefit wasn't as clear; variable-width code points are painful in a lot of ways, and 16 bits per code point seemed like it would be enough for anybody.

edflsafoiewq•4mo ago
My takeaway from this story has always been that both MS and Plan 9 simply passively implemented Unicode as received. It was only IBM that had the vision to see that the encoding was wrong and they should make a new one.
anonymars•4mo ago
But doesn't OS/2 still use UCS-2 internally? And only years later (1995+)?

Potential source: https://ia802804.us.archive.org/13/items/os2developmentrelat...

bobmcnamara•4mo ago
Can't imagine they would've wanted to change encoding between Win3.1 and NT3.1.
masfuerte•4mo ago
But they did?
bobmcnamara•4mo ago
UCS-2 support was first released with some add-on for Win3.1 (which also still did most stuff with multiple character sets).
kimixa•4mo ago
IIRC Win32s (the subset of Win32 released for Windows 3.1) only added UCS-2 string processing, none of the system's wide-character APIs.

I think the actual OS was still all codepage-based (with the "multibyte" versions for things like Eastern languages being pretty much forks), and Windows 95 wasn't really much different.

roytam87•4mo ago
Win32s brings codepage-to/from-widechar APIs and codepage table files (P_*.NLS), plus an "AnsiCP=" setting in the [NLS] section of win32s.ini.

16-bit IE brings its own MSNLS.DLL for handling codepages other than the ACP (active code page) in Win3.1x.

And Win9x also works mainly in the ANSI codepage, with some kernel-side Unicode support.

layer8•4mo ago
The history is more complicated than that. Originally, ISO/IEC 10646 and Unicode were two separate efforts, with only ISO having 31-ish-bit ideas, and Unicode being strictly 16 bits [0][1]. UTF-8 as in TFA was clearly developed to cover the 31-bit ISO character collection, the encoding going up to 6 bytes per character, although a couple of years later this was restricted to the 4 bytes sufficient to cover the 20 bits of Unicode 2.0 (1996). The initial UTF-8 development is therefore somewhat beyond the scope of what Unicode 1.x was about at the time.

Furthermore, the development of Windows NT had already begun in 1989 (then planned as OS/2 3.0) and proceeded in parallel with the finalization of Unicode 1.0 and its eventual adoption by ISO, which led to Unicode 1.1 and ISO/IEC 10646-1:1993. It was natural to adopt that standardization effort.

Once established, the 16-bit encoding used by Windows NT was engrained in kernel and userspace APIs, notably the BSTR string type used by Visual Basic and COM, and importantly in NTFS. Adopting UTF-8 for Windows XP would have provided little benefit at that point, while causing a lot of complications. For backwards compatibility, something like WTF-8 would effectively have been required, and there would have been an additional performance penalty for converting back and forth from the existing WCHAR/BSTR APIs and serializations. It wasn't remotely a viable opportunity for such a far-reaching change.

Lastly, my recollection is that UTF-8 only became really widespread on the web some time after the release of Windows XP (2001), maybe roughly around Vista.

[0] https://en.wikipedia.org/wiki/Universal_Coded_Character_Set#...

[1] "Internationalization and character set standards", September 1993, https://dl.acm.org/doi/pdf/10.1145/174683.174687

xenadu02•4mo ago
It is worth reading the history of the proposal. The final form is superior to the others, so someone was doing a lot of editing!

Compare the second form with the final one, where the use of multiple letters was eliminated in favor of "v" to indicate the bits of the encoded character.

I also chuckle at the initial implementation's note about the desire to delete support for the 4/5/6-byte versions. Someone was still laboring under the UCS/UTF-16 delusion that 16 bits were sufficient.

Rendello•4mo ago
They pretty much got their wish: the 5- and 6-byte sequences are gone, along with half of the 4-byte range!

The RFC that restricted it: https://www.rfc-editor.org/rfc/rfc3629#page-11

A UTF-8 playground: https://utf8-playground.netlify.app/
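
As a rough sketch of where that leaves the encoding after RFC 3629 (the helper name utf8_len and the sample code points are made up for illustration):

    #include <stdio.h>
    #include <stdint.h>

    /* Bytes UTF-8 needs for a code point under RFC 3629; the original
       1992 FSS-UTF scheme also had 5- and 6-byte forms reaching 2^31 - 1. */
    static int utf8_len(uint32_t cp)
    {
        if (cp <= 0x7F)     return 1;  /* 0xxxxxxx                       */
        if (cp <= 0x7FF)    return 2;  /* 110xxxxx 10xxxxxx              */
        if (cp <= 0xFFFF)   return 3;  /* 1110xxxx 10xxxxxx 10xxxxxx     */
        if (cp <= 0x10FFFF) return 4;  /* 11110xxx 10xxxxxx ...          */
        return -1;                     /* beyond U+10FFFF: not encodable */
    }

    int main(void)
    {
        uint32_t samples[] = { 0x40, 0x3B1, 0x20AC, 0x1F600, 0x200000 };
        for (int i = 0; i < 5; i++)
            printf("U+%06X -> %d byte(s)\n", (unsigned)samples[i],
                   utf8_len(samples[i]));
        return 0;
    }

The last sample falls in the old 5-byte territory that the RFC cut off; a conforming encoder now rejects it.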

bikeshaving•4mo ago
There are socio-economic reasons why the early computing boom (ENIAC, UNIVAC, IBM mainframes, early programming languages like Fortran and COBOL) was dominated by the US: massive wartime R&D, university infrastructure, and a large domestic market. But I wonder if the Anglophone world had an orthographic advantage as well. English uses 26 letters with no diacritics, compared to other languages like Chinese (thousands of characters), Hindi (50+ letters), or French/German (Latin with diacritics).

That simplicity made early character encodings like 7-bit ASCII feasible, which in turn lowered the hardware and software barriers for building computers, keyboards, and programming languages. In other words, the Latin alphabet’s compactness may have given English-speaking engineers a “low-friction” environment for both computation and communication. And now it’s the lingua franca for most computing, on top of which support for other languages is built.

It’s very interesting to think about how written scripts give different cultures advantages in computing and elsewhere. I wonder, for instance, how scripts and AI interact: LLMs trained on Chinese are working with a high-density orthography backed by a stable, 3,500-year dataset.

ummonk•4mo ago
The same applies to why China had all the building blocks (pun intended) of the printing press but it was perfected by Gutenberg in Europe, where the number of glyphs was much more manageable.
zahlman•4mo ago
Indeed. Even if you try to split hanzi into parts it's far more unwieldy (https://en.wikipedia.org/wiki/Kangxi_radicals).
pklausler•4mo ago
We got lots done with 6-bit pre-ASCII encodings, actually, like CDC Display Code and Univac's Fieldata. It's more than enough for 26 letters, 10 digits, and lots of punctuation. And there are faint echoes of these early character sets remaining in ASCII -- a zero byte is ^@, for example, because @ was the zero-valued Fieldata "master space" character, which distinguished EXEC 8 control cards from source code and data cards.
duskwuff•4mo ago
> a zero byte is ^@, for example, because...

A zero byte is ^@ because 0x00 + 64 = '@'. The same pattern holds for all C0 control codes.
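
A tiny illustration of that arithmetic, assuming nothing beyond the conventional caret notation for C0 codes:

    #include <stdio.h>

    /* C0 control codes 0x00..0x1F are written as '^' plus the character
       64 positions later: 0x00 -> ^@, 0x01 -> ^A, 0x1B -> ^[ (escape). */
    int main(void)
    {
        unsigned char controls[] = { 0x00, 0x01, 0x08, 0x1B };
        for (int i = 0; i < 4; i++)
            printf("0x%02X is ^%c\n", controls[i], controls[i] + 64);
        return 0;
    }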

pklausler•4mo ago
Yes, and why is '@' at 0x40?
kps•4mo ago
Computer character codes descended directly from pre-computer codes, either teletype or punched card. The advantage holds back through printing to writing itself; having a small, fixed set of glyphs that can represent anything is just better.
Pet_Ant•4mo ago
I really wonder why Arabic never went back to its earlier, non-cursive letterforms for printing. What we think of as the Arabic "alphabet" is just its cursive form. They have an alphabet that is basically just Syriac. It would have been easier to render on low-bit displays, and you wouldn't have to deal with the word-initial variants, etc.

https://en.wikipedia.org/wiki/Nabataean_script

kps•4mo ago
Same with Japan using mostly kanji when they have a syllabary available (while Korea invented a pretty neat alphabet and largely dropped hanja).
kevin_thibedeau•4mo ago
Japanese has (slightly) more homophones and favors monosyllabic Sino-Japanese in compound words. That makes it hard to depend entirely on phonetic script. Same reason why English retains irregular spellings to help with some disambiguation.
jcranmer•4mo ago
> English uses 26 letters with no diacritics, compared to other languages like Chinese (thousands of characters), Hindi (50+ letters), or French/German (latin with diacritics).

The English language has diacritics (see words like naïve, façade, résumé, or café). It's just that the English language uses them so rarely that they are largely dropped in any context where they are hard to introduce. Note that this adaptation to lack-of-diacritic can be found in other Latin script languages: French similarly is prone to loss-of-diacritic (especially in capital letters), whereas German has alternative spelling rules (e.g., Schroedinger instead of Schrödinger).

inglor_cz•4mo ago
There must be some alternate universe where WWII never happened, all the talented Hungarian and Polish mathematicians, logicians etc. stayed home, and computer parts and applications carry names like Emlékezet or Wrzeszcz.
anthk•4mo ago
No Spanish fascism winning; thus, the Spanish left sides with Republican France. The Nazis get far less support and are crushed fast, many years before 1945. As for Spain itself, it wouldn't suffer a war, a postwar period, and a National-Catholic ruralist shithole regime. No 15-20 years of backwardness compared to France/Europe until 1986 (Spain joining the pre-EU; the postwar Spain of 1940-1950 was barely on par with the Europe of 1910-1920, if that, modulo the boost from tourism in the '60s), and it makes itself a role model for South America instead. No polarized left and right on that continent, so they achieve European-level standards of living. People merge Iberian humanism with German engineering.

People like Torres Quevedo would appear everywhere because there would be no anti-scientific people dragging education down to the level of the 18th century or earlier. I am not kidding: it was pure creationism under Franco. By law. If you said something against religion, you were either fined, jailed, or beaten up.

anthk•4mo ago
Spanish isn't much bigger...
fsckboy•4mo ago
>early character encodings like 7-bit ASCII

early character encoding was 6-bit ASCII, no lower case

gpvos•4mo ago
"as told by Rob Pike" is an essential part of the title, can that be added/reinstated?
YesThatTom2•4mo ago
I’ve always assumed this was the diner. I wonder if someone can confirm:

https://prestigediner.com/