We'd be better off with 9-bit bytes

https://pavpanchekha.com/blog/9bit.html

20•luu•2h ago

Comments

FrankWilhoit•2h ago

That's what the PDP-10 community was saying decades ago.

Keyframe•1h ago

Yeah, but hear me out - 10-bit bytes!

Waterluvian•33m ago

Uh oh. Looks like humanity has been bitten by the bit byte bug.

pdpi•31m ago

One of the nice features of 8 bit bytes is being able to break them into two hex nibbles. 9 bits breaks that, though you could do three octal digits instead I suppose.

10 bit bytes would give us 5-bit nibbles. That would be 0-9a-v digits, which seems a bit extreme.
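As a rough illustration of the digit groupings discussed above, here is a minimal sketch. The helper name `to_digits` is hypothetical, and the 0-9a-v alphabet is simply the first 32 characters of digits followed by lowercase letters.

```python
import string

# Digits first, then lowercase letters: 0-9 a-z (only the first 2**bits_per_digit are used).
ALPHABET = string.digits + string.ascii_lowercase


def to_digits(value: int, total_bits: int, bits_per_digit: int) -> str:
    """Render `value` as fixed-width digits, each covering `bits_per_digit` bits."""
    if total_bits % bits_per_digit:
        raise ValueError("byte width is not a whole number of digits")
    if not 0 <= value < (1 << total_bits):
        raise ValueError("value does not fit in the byte width")
    mask = (1 << bits_per_digit) - 1
    count = total_bits // bits_per_digit
    # Walk from the most significant digit down to the least significant one.
    return "".join(
        ALPHABET[(value >> (bits_per_digit * i)) & mask]
        for i in reversed(range(count))
    )


print(to_digits(0xFF, 8, 4))    # 8-bit byte as two hex nibbles: ff
print(to_digits(0o777, 9, 3))   # 9-bit byte as three octal digits: 777
print(to_digits(1023, 10, 5))   # 10-bit byte as two 5-bit digits: vv
```

A 9-bit byte cannot be passed with 4-bit digits here; the helper raises instead, which is the "breaks hex nibbles" point in code form.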

pratyahava•26m ago

Crockford base32 would be great. it is 0–9, A–Z minus I, L, O, U.
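A minimal sketch of what that could look like for a hypothetical 10-bit byte, written as two Crockford base32 symbols. The function names are made up for illustration; the alphabet and the decoding aliases (O read as 0, I and L read as 1) follow Crockford's description, and check symbols are left out.

```python
# Crockford base32 alphabet: 0-9 and A-Z without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_byte10(value: int) -> str:
    """Encode one 10-bit value as two 5-bit Crockford symbols."""
    if not 0 <= value < (1 << 10):
        raise ValueError("value does not fit in 10 bits")
    return CROCKFORD[value >> 5] + CROCKFORD[value & 0x1F]


def decode_byte10(text: str) -> int:
    """Decode two Crockford symbols, accepting lowercase and the O/I/L aliases."""
    cleaned = text.upper().replace("O", "0").replace("I", "1").replace("L", "1")
    if len(cleaned) != 2:
        raise ValueError("expected exactly two symbols")
    high, low = (CROCKFORD.index(symbol) for symbol in cleaned)
    return (high << 5) | low


print(encode_byte10(1023))   # ZZ
print(decode_byte10("zz"))   # 1023
print(decode_byte10("OL"))   # read as "01", so 1
```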

pdpi•11m ago

The moment you feel the need to skip letters due to propensity for errors should also be the moment you realise you're doing something wrong, though. It's kind of fine if you want a case insensitive encoding scheme, but it's kind of nasty for human-first purposes (e.g. in source code).

pratyahava•29m ago

i have been fascinated with the idea of 10-bit bytes for years. i asked chatgpt "below is a piece of an article about 9-bit bytes, create the same but for 10-bit bytes" and gave it a piece; the reply is below.

IPv4: Everyone knows the story: IPv4 had 32-bit addresses, so about 4 billion total (less, due to various reserved subnets). That's not enough in a world with 8 billion humans, and that's led to NATs, more active network middleware, and the impossibly glacial pace of IPv6 roll-out. It's 2025 and Github (Github!) doesn't support IPv6. But in a world with 10-bit bytes IPv4 would have had 40-bit addresses, about 1 trillion total. That would be more than enough right now, and likely sufficient well into the 22nd century. (In our timeline, exhaustion hit in 2011, when demand was doubling every five years. 256× more addresses gets us to 2065 projecting linearly, and probably later with slowing growth.) When exhaustion does set in, it would plausibly happen in a world where address demand has stabilized, and light market forces or reallocation would suffice, with no need for NAT spaghetti or painful protocol transitions.

UNIX time: In our timeline, 32-bit UNIX timestamps run out in 2038, so again all software has to painfully transition to larger, 64-bit structures. Equivalent signed 40-bit timestamps last until roughly the year 19,391, so absolutely no hurry. Negative timestamps would reach back to roughly 15,450 BCE, easily covering the birth of agriculture and the end of the last Ice Age. And yes, probably long enough to support most science fiction timelines without breaking a sweat.
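Timestamp ranges like these are easy to sanity-check. The sketch below assumes a signed counter of whole seconds since 1970 and a mean Gregorian year of 365.2425 days; the function names are hypothetical, and the results are approximate years rather than exact rollover dates.

```python
SECONDS_PER_YEAR = 31_556_952  # mean Gregorian year: 365.2425 days * 86400 s


def signed_range_years(bits: int) -> float:
    """Years representable on each side of the epoch by a signed `bits`-bit counter."""
    return 2 ** (bits - 1) / SECONDS_PER_YEAR


def rollover_year(bits: int, epoch_year: int = 1970) -> int:
    """Approximate calendar year in which the positive range runs out."""
    return epoch_year + int(signed_range_years(bits))


def earliest_year(bits: int, epoch_year: int = 1970) -> int:
    """Approximate earliest representable year (negative means before year 0)."""
    return epoch_year - int(signed_range_years(bits))


print(rollover_year(32))   # 2038, matching the familiar 32-bit limit
print(rollover_year(40))   # about 19391 under these assumptions
print(earliest_year(40))   # about -15451 under these assumptions
```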

iosjunkie•22m ago

No! No, no, not 10! He said 9. Nobody's comin' up with 10. Who processing with 10 bits? What’s the extra bit for? You’re just wastin’ electrons.

titzer•10m ago

10 bit bytes would be awesome! Think of 20 bit microcontrollers and 40 bit workstations. 40 bits makes 5 byte words, that'd be rad. Also, CPUs could support "legacy" 32 bit integers and use a full 8 bits for tags, which are useful for implementing dynamic languages.
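A minimal sketch of the tagging idea, assuming a hypothetical 40-bit word that holds a "legacy" 32-bit payload in the low bits and an 8-bit type tag in the high bits. The tag values and function names are invented for illustration.

```python
WORD_BITS = 40
PAYLOAD_BITS = 32
TAG_BITS = WORD_BITS - PAYLOAD_BITS  # 8 bits left over for the tag
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1

# Hypothetical tag assignments for a dynamic language runtime.
TAG_INT = 0x01
TAG_HEAP_REF = 0x02


def pack_word(tag: int, payload: int) -> int:
    """Combine an 8-bit tag and a 32-bit payload into one 40-bit word."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in 8 bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 32 bits")
    return (tag << PAYLOAD_BITS) | payload


def unpack_word(word: int) -> tuple[int, int]:
    """Split a 40-bit word back into (tag, payload)."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in 40 bits")
    return word >> PAYLOAD_BITS, word & PAYLOAD_MASK


word = pack_word(TAG_INT, 0xDEADBEEF)
print(hex(word))          # 0x1deadbeef
print(unpack_word(word))  # (1, 3735928559)
```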

zamadatix•55m ago

Because we have 8-bit bytes, we are familiar with the famous or obvious cases where multiples of 8 bits ran out, and those cases sound a lot better with 12.5% extra bits. What's harder to see in this kind of thought experiment is what the famous or obvious cases of multiples of 9 bits running out would have been. The article starts to think about some of these towards the end, but it's hard, as it's not immediately obvious how many others there might be (or, alternatively, why the total number of issues would be significantly different from what 8-bit bytes had). ChatGPT in particular isn't going to have a ton of training data about the problems of 9-bit multiples running out to hand-feed you.

It also works in the reverse direction. E.g. knowing that networking headers don't even care about byte alignment for subfields (e.g. a VID is 12 bits because it's packed with a few other fields in 2 bytes), I wouldn't be surprised if IPv4 had ended up with 3-byte addresses = 27 bits, instead of 4*9=36, since the designers were more worried about small-packet overhead than about matching specific word sizes in certain CPUs.
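To make the sub-byte packing concrete, here is a small sketch that splits the two-byte tag control field of a VLAN header, assuming the IEEE 802.1Q layout of a 3-bit priority, a 1-bit drop-eligible flag, and a 12-bit VLAN ID. The function name and the sample bytes are made up.

```python
import struct


def parse_tci(raw: bytes) -> dict[str, int]:
    """Split a 2-byte 802.1Q tag control field into its three sub-fields."""
    if len(raw) != 2:
        raise ValueError("the tag control field is exactly 2 bytes")
    (tci,) = struct.unpack("!H", raw)  # network byte order, unsigned 16-bit
    return {
        "pcp": tci >> 13,         # top 3 bits: priority code point
        "dei": (tci >> 12) & 1,   # next bit: drop eligible indicator
        "vid": tci & 0x0FFF,      # low 12 bits: VLAN ID
    }


print(parse_tci(bytes([0xA0, 0x64])))  # {'pcp': 5, 'dei': 0, 'vid': 100}
```

None of the three fields lines up with a byte boundary, which is the point being made about header designers not caring much about byte alignment.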

MangoToupe•54m ago

Maybe if we worked with 7-bit bytes folks would be more grateful.

folsom•54m ago

I don't know. What if we had ended up with a 27-bit address space?

As far as ISPs competing on speeds in the mid 90s, for some reason it feels like historical retrospectives are always about ten years off.

jayd16•51m ago

I guess nibbles would be 3 bits and you'd have 3 per byte?

monocasa•50m ago

Ohh, and then we could write the digits in octal.

Interestingly, the N64 internally had 9 bit bytes, just accesses from the CPU ignored one of the bits. This wasn't a parity bit, but instead a true extra data bit that was used by the GPU.

ethan_smith•32m ago

The N64's Reality Display Processor actually used that 9th bit as a coverage mask for antialiasing, allowing per-pixel alpha blending without additional memory lookups.

kazinator•48m ago

36 bit addresses would be better than 32, but I like being able to store a 64 bit double or pointer or integer in a word using NaN tagging (subject to the limitation that only 48 bits of the pointer are significant).
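For readers unfamiliar with NaN tagging, a minimal sketch follows. It assumes IEEE 754 doubles, a platform that preserves quiet-NaN payload bits when moving floats around (true of mainstream CPython builds, but not guaranteed by the language), and a hypothetical 48-bit pointer value. Real runtimes use more elaborate layouts with extra type bits.

```python
import math
import struct

QUIET_NAN = 0x7FF8_0000_0000_0000   # exponent all ones plus the quiet bit
PAYLOAD_MASK = (1 << 48) - 1        # low 48 bits carry the pointer


def box_pointer(address: int) -> float:
    """Hide a 48-bit address in the payload of a quiet NaN."""
    if address & ~PAYLOAD_MASK:
        raise ValueError("address needs more than 48 bits")
    bits = QUIET_NAN | address
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def unbox_pointer(value: float) -> int:
    """Recover the 48-bit address from a boxed NaN."""
    if not math.isnan(value):
        raise ValueError("not a boxed value: this is an ordinary double")
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & PAYLOAD_MASK


boxed = box_pointer(0x7F12_3456_789A)   # made-up address
print(math.isnan(boxed))                # True: arithmetic code sees a NaN
print(hex(unbox_pointer(boxed)))        # 0x7f123456789a
```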

Retr0id•48m ago

Aside from memory limits, one of the problems with 32-bit pointers is that ASLR is weakened as a security mitigation - there's simply fewer bits left to randomise. A 36-bit address space doesn't improve on this much.

64-bit pointers are pretty spacious and have "spare" bits for metadata (e.g. PAC, NaN-boxing). 72-bit pointers are even better I suppose, but their adoption would've come later.

kazinator•45m ago

Problem is, not only did we have decades of C code that unnecessarily assumed 8/16/32, this all-the-world-is-a-VAX view is now baked into newer languages.

C is good for portability to this kind of machine. You can have a 36 bit int (for instance), CHAR_BIT is defined as 9 and so on.

With a little bit of extra reasoning, you can make the code fit different machine sizes so that you use all the available bits.

pratyahava•20m ago

was that assumption in C code really unnecessary? i suppose it made many things much easier.

0cf8612b2e1e•12m ago

Now a C++ proposal to define a byte as 8 bits

https://isocpp.org/files/papers/P3477R1.html

TruffleLabs•41m ago

PDP-8 has a 12-bit word size

bawolff•41m ago

> But in a world with 9-bit bytes IPv4 would have had 36-bit addresses, about 64 billion total.

Or we would have had 27-bit addresses and run into problems sooner.

bigstrat2003•17m ago

That might've been better, actually. The author makes the mistake of "more time would've made this better", but we've had plenty of time to transition to IPv6. People simply don't because they are lazy and IPv4 works for them. More time wouldn't help that, any more than a procrastinating student benefits when the deadline for a paper gets extended.

But on the other hand, if we had run out sooner, perhaps IPv4 wouldn't be as entrenched and people would've been more willing to switch. Maybe not, of course, but it's at least a possibility.

dmitrygr•2m ago

> simply don't because they are lazy and IPv4 works for them

Or because IPv6 was not a simple "add more bits to address" but a much larger in-places-unwanted change.

SlowTao•33m ago

Can you imagine the argument for 8bit bytes if we still lived in the original 6bit world of the 1950s?

A big part of the move to 8bit systems was that it allowed expanded text systems with letter casing, punctuation and various ASCII stuff.

We could move to the world of Fortran 36bit if really needed and solve all these problems while introducing a problem called Fortran.

LegionMammal978•22m ago

There was already more than enough space for characters with 12-bit systems like the PDP-8. If anything, the convergence on 8-bit words just made it more efficient to use 7-bit codepages like ASCII.

duskwuff•31m ago

Non-power-of-2 sizes are awkward from a hardware perspective. A lot of designs for e.g. optimized multipliers depend on the operands being divisible into halves; that doesn't work with units of 9 bits. It's also nice to be able to describe a bit position using a fixed number of bits (e.g. 0-7 in 3 bits, 0-31 in 5 bits, 0-63 in 6 bits), e.g. to represent a number of bitwise shift operations, or to select a bit from a byte; this also falls apart with 9, where you'd have to use four bits and have a bunch of invalid values.
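The bit-index point can be checked with a few lines. This is only a sketch; the function names are made up.

```python
def index_bits(width: int) -> int:
    """Bits needed to name any bit position in a `width`-bit unit."""
    return (width - 1).bit_length()


def wasted_encodings(width: int) -> int:
    """Index values that fit in the field but name no real bit."""
    return (1 << index_bits(width)) - width


for width in (8, 9, 32, 36, 64):
    print(width, index_bits(width), wasted_encodings(width))
# 8 3 0
# 9 4 7
# 32 5 0
# 36 6 28
# 64 6 0
```

With 9-bit units a shift amount or bit selector needs 4 bits, and 7 of the 16 possible values are invalid; with powers of two nothing is wasted.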

smallstepforman•29m ago

The elephant in the room nobody talks about is silicon cost (wires, gates, multiplexers, AND and OR gates, etc.). With a 4th lane, you may as well go straight to 16 bits to a byte.

pratyahava•23m ago

This must be the real reason for using 8-bit. But then why did they make a 9-bit machine instead of a 16-bit one?

alphazard•25m ago

When you stop to think about it, it really doesn't make sense to have memory addresses map to 8-bit values, instead of bits directly. Storage, memory, and CPUs all deal with larger blocks of bits, which have names like "pages" and "sectors" and "words" depending on the context.

If accessing a bit is really accessing a larger block and throwing away most of it in every case, then the additional byte grouping isn't really helping much.

SpaceNoodled•14m ago

It makes sense for the address to map to a value the same width as the data bus.

A one-bit wide bus ... er, wire, now, I guess ... Could work just fine, but now we are extremely limited with the number of operations achievable, as well as the amount of addressable data: an eight-bit address can now only reference a maximum of 32 bytes of data, which is so small as to be effectively useless.

m463•23m ago

We have already solved this problem many times.

In clothing stores, numerical clothes sizes have steadily grown a little larger.

The same make and model car/suv/pickup have steadily grown larger in stance.

I think what is needed is to silently add 9-bit bytes, but don't tell anyone.

also: https://imgs.xkcd.com/comics/standards_2x.png

nottorp•22m ago

Of course, if that happens we'll get an article demanding 10-bit bytes.

Got to stop somewhere.

NelsonMinar•19m ago

This is ignoring the natural fact that we have 8 bit bytes because programmers have 8 fingers.

mkl•14m ago

Most have 10. That's the reason we use base 10 for numbers, even though 12 would make a lot of things easier: https://en.wikipedia.org/wiki/Duodecimal

skort•17m ago

> Thank you to GPT 4o and o4 for discussions, research, and drafting.

Note to the author, put this up front, so I know that you did the bare minimum and I can safely ignore this article for the slop it is.

js8•16m ago

I have thought for fun about a little RISC microcomputer with 6-bit bytes, and 4-byte words (12 MiB of addressable RAM). I think 6-bit bytes would have been great at a point in history, and in something crazy fun like Minecraft. (It's actually an interesting question: if we were to design early microprocessors with today's knowledge of HW methods, things like RISC, caches or pipelining, what would we do differently?)
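The 12 MiB figure checks out under the assumption that an address is one 4-byte (24-bit) word and that memory is addressed per 6-bit byte. A quick sketch, with a hypothetical function name:

```python
def addressable_mib(bits_per_byte: int, bytes_per_word: int) -> float:
    """Addressable memory in MiB of 8-bit octets, assuming one-word addresses."""
    address_bits = bits_per_byte * bytes_per_word   # 6 * 4 = 24 address bits
    addressable_bytes = 2 ** address_bits           # each address names one byte
    total_bits = addressable_bytes * bits_per_byte
    return total_bits / 8 / 2 ** 20                 # convert to octets, then MiB


print(addressable_mib(6, 4))   # 12.0
print(addressable_mib(8, 4))   # 4096.0, the familiar 4 GiB of a 32-bit machine
```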
