10 bit bytes would give us 5-bit nibbles. That would be 0-9a-v digits, which seems a bit extreme.
When you think end-to-end for a whole system and do a cost-benefit analysis and find that skipping some letters helps, why wouldn't you do it?
But I'm guessing you have thought of this? Are you making a different argument? Does it survive contact with system-level thinking under a utilitarian calculus?
Designing good codes for people isn't just about reducing transcription errors in the abstract. It can have real-world impacts on businesses and lives.
Safety engineering is often considered boring until it is your tax money on the line or it hits close to home (e.g. the best friend of your sibling dies in a transportation-related accident.) For example, pointing and calling [1] is a simple habit that increases safety with only a small (even insignificant) time loss.
GI made 10-bit ROMs so that you wouldn't waste 37.5% of your ROM space storing those 6 reserved bits for every opcode. Storing your instructions in 10-bit ROM instead of 16-bit ROM meant that if you needed to store 16-bit data in your ROM you would have to store it in two parts. They had a special instruction that would handle that.
The Mattel Intellivision used a CP1610 and used the 10-bit ROM.
The term Intellivision programmers used for a 10-bit quantity was "decle". Half a decle was a "nickel".
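For illustration, here's a minimal C sketch of the idea described above: a 16-bit constant stored as two consecutive 10-bit ROM words ("decles"), 8 bits in each, and reassembled when read. The helper name is invented, and the CP1610's actual double-byte-data mechanism may differ in detail.

    #include <stdint.h>

    /* Reassemble a 16-bit value stored across two 10-bit decles. */
    static uint16_t read16_from_decles(const uint16_t *rom, unsigned i)
    {
        uint16_t lo = rom[i]     & 0xFFu;   /* low 8 bits of the first decle  */
        uint16_t hi = rom[i + 1] & 0xFFu;   /* low 8 bits of the second decle */
        return (uint16_t)(lo | (hi << 8));
    }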
What's the point?
It also works in the reverse direction. E.g., knowing that networking headers don't even care about byte alignment for sub-fields (e.g. a VLAN ID is 12 bits because it's packed with a few other fields in 2 bytes), I wouldn't be surprised if IPv4 would have ended up with 3-byte addresses = 27 bits, instead of 4*9 = 36, since the designers were more worried about small packet overheads than about matching specific word sizes in certain CPUs.
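For a concrete picture of that kind of packing, here is a small C sketch of the 802.1Q tag control field, where the 12-bit VLAN ID shares two octets with a 3-bit priority and a 1-bit DEI flag; the helper names are invented for illustration.

    #include <stdint.h>

    /* 16-bit tag control information: PCP(3) | DEI(1) | VID(12). */
    static uint16_t vlan_vid(uint16_t tci) { return tci & 0x0FFFu; }         /* low 12 bits       */
    static uint8_t  vlan_pcp(uint16_t tci) { return (uint8_t)(tci >> 13); }  /* top 3 bits        */
    static uint8_t  vlan_dei(uint16_t tci) { return (tci >> 12) & 0x1u; }    /* the bit in between */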
I don't think this is enough of a reason, though.
Thinking about the number of bits in the address is only one of the design parameters. The partitioning between network masks and host space is another design decision. The decision to reserve class D and class E space yet another. More room for hosts is good. More networks in the routing table is not.
Okay, so if v4 addresses were composed of four 9-bit bytes instead of four 8-bit octets, how would the early classful networks have shaken out? It doesn't do a lot of good if a class C network is still defined by the last byte.
With so many huge changes like those, the alternate history would by today have diverged far from this universe.
The knock-on effect of EBCDIC having room for accented characters would have been the U.S.A. not changing a lot of placenames when the federal government made the GNIS in the 1970s and 1980s, for example. MS-DOS might have ended up with a 255-character command-tail limit, meaning that possibly some historically important people would never have been motivated to learn the response file form of the Microsoft LINK command. People would not have hit a 256-character limit on path lengths on DOS+Windows.
Teletext would never have needed national variants, would have had different graphics, would have needed a higher bitrate, might have lasted longer, and people in the U.K. would have possibly never seen that dog on 4-Tel. Octal would have been more convenient than hexadecimal, and a lot of hexadecimal programming puns would never have been made. C-style programming languages might have had more punctuation to use for operators.
Ð or Ç could have been MS-DOS drive letters. Microsoft could have spelled its name with other characters, and we could all be today reminiscing about µs-dos. The ZX Spectrum could have been more like the Oric. The FAT12 filesystem format would never have happened. dBase 2 files would have had bigger fields. People could have put more things on their PATHs in DOS, and some historically important person would perhaps have never needed to learn how to write .BAT files and gone on to a career in computing.
The Domain Name System would have had a significantly different history, with longer label limits, more characters, and possibly case sensitivity if non-English letters with quirky capitalization rules had been common in SBCS in 1981. EDNS0 might never have happened or been wildly different. RGB 5-6-5 encoding would never have happened; and "true colour" might have ended up as a 12-12-12 format with nothing to spare for an alpha channel. 81-bit or 72-bit IEEE 754 floating point might have happened.
"Multimedia" and "Internet" keyboards would not have bumped up against a limit of 127 key scancodes, and there are a couple of luminaries known for explaining the gynmastics of PS/2 scancodes who would have not had to devote so much of their time to that, and possibly might not have ended up as luminaries at all. Bugs in several famous pieces of software that occurred after 49.7 days would have either occurred much sooner or much later.
Actual intelligence is needed for this sort of science fiction alternative history construction.
I don't know about that; it had room for lots of accented characters with code pages. If that went unused, it probably would have also gone unused in the 9-bit version.
> Actual intelligence is needed for this sort of science fiction alternative history construction.
Why? We're basically making a trivia quiz that benefits memorization far more than intelligence. And you actively don't want to get into the weeds of chaos-theory consequences, or you forget the article you're writing.
Imagine an alternative world that used 7-bit bytes. In that world, Pavel Panchekha wrote a blog post titled "We'd be Better Off with 8-bit Bytes". It was so popular that most people in that world look up to us, the 8-bit-byters.
So to summarize, people that don't exist* are looking up to us now.
* in our universe at least (see Tegmark's Level III Multiverse): https://space.mit.edu/home/tegmark/crazy.html or Wikipedia
As far as ISPs competing on speeds in the mid 90s, for some reason it feels like historical retrospectives are always about ten years off.
Actually I doubt we'd have picked 27-bit addresses. That's about 134M addresses; that's less than the US population (it's about the number of households today?), and Europe was also relevant when IPv4 was being designed. In any case, if we had chosen 27-bit addresses, we'd have hit exhaustion just a bit before the big telecom boom, a lucky coincidence meaning the consumer internet would largely require another transition anyway. Transitioning from 27-bit to, I don't know, 45-bit or 99-bit or whatever we'd choose next wouldn't be as hard as the IPv6 transition is today.
Interestingly, the N64 internally had 9-bit bytes; accesses from the CPU just ignored one of the bits. This wasn't a parity bit, but a true extra data bit that was used by the GPU.
64-bit pointers are pretty spacious and have "spare" bits for metadata (e.g. PAC, NaN-boxing). 72-bit pointers are even better I suppose, but their adoption would've come later.
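As a hedged sketch of what those spare bits buy, here is an illustrative C snippet that stashes a small tag in the top bits of a 64-bit pointer, assuming only the low 48 bits of a user-space address are significant. Real schemes (PAC, NaN-boxing) are more involved; all names here are invented.

    #include <stdint.h>

    #define TAG_SHIFT 48
    #define ADDR_MASK ((1ULL << TAG_SHIFT) - 1)

    /* Pack a small integer tag into the unused high bits of a pointer. */
    static uint64_t tag_pointer(void *p, uint64_t tag)
    {
        return ((uint64_t)(uintptr_t)p & ADDR_MASK) | (tag << TAG_SHIFT);
    }

    /* Strip the tag off before dereferencing. */
    static void *untag_pointer(uint64_t tagged)
    {
        return (void *)(uintptr_t)(tagged & ADDR_MASK);
    }

    static uint64_t pointer_tag(uint64_t tagged)
    {
        return tagged >> TAG_SHIFT;
    }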
C is good for portability to this kind of machine. You can have a 36-bit int (for instance), CHAR_BIT is defined as 9, and so on.
With a little bit of extra reasoning, you can make the code fit different machine sizes so that you use all the available bits.
Sometimes the latter is a win, but not if that is your default modus operandi.
Another issue is that machine-specific code that assumes compiler and machine characteristics often has outright undefined behavior, not making distinctions between "this type is guaranteed to be 32 bits" and "this type is guaranteed to wrap around to a negative value" or "if we shift this value 32 bits or more, we get zero so we are okay" and such.
There are programmers who are not stupid like this, but those are the ones who will tend to reach for portable coding.
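As a minimal sketch of that portable style, assuming nothing beyond what <limits.h> guarantees, the same source stays correct whether CHAR_BIT is 8 or 9:

    #include <limits.h>
    #include <stdio.h>

    /* A byte mask built from CHAR_BIT rather than a hard-coded 0xFF. */
    #define BYTE_MASK ((1u << CHAR_BIT) - 1)

    int main(void)
    {
        printf("bits per byte: %d\n", CHAR_BIT);              /* 9 on such a machine      */
        printf("bits per int : %zu\n", sizeof(int) * CHAR_BIT);

        unsigned value = 0x1234u;
        unsigned low_byte = value & BYTE_MASK;                 /* low 8 or 9 bits, as appropriate */
        printf("low byte: %#x\n", low_byte);
        return 0;
    }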
int32_t main(int32_t argc, char **argv)?
How about struct tm?

    struct tm {
        int32_t tm_sec;    /* Seconds (0-60) */
        int32_t tm_min;    /* Minutes (0-59) */
        int32_t tm_hour;   /* Hours (0-23) */
        int32_t tm_mday;   /* Day of the month (1-31) */
        int32_t tm_mon;    /* Month (0-11) */
        int32_t tm_year;   /* Year - 1900 */
        int32_t tm_wday;   /* Day of the week (0-6, Sunday = 0) */
        int32_t tm_yday;   /* Day in the year (0-365, 1 Jan = 0) */
        int32_t tm_isdst;  /* Daylight saving time */
    };
What for? Or do we "shrink wrap" every field to the smallest type? "uint8_t tm_hour"?

Or we would have had 27-bit addresses and run into problems sooner.
But on the other hand, if we had run out sooner, perhaps IPv4 wouldn't be as entrenched and people would've been more willing to switch. Maybe not, of course, but it's at least a possibility.
Or because IPv6 was not a simple "add more bits to the address" but a much larger, in places unwanted, change.
They're almost always deployed though because people end up liking the ideas. They don't want to configure VRRP for gateway redundancy, they don't want a DHCP server for clients to be able to connect, they want to be able to use link-local addresses for certain application use cases, they want the random addresses for increased privacy, they want to dual stack for compatibility, etc. For the people that don't care they see people deploying all of this and think "oh damn, that's nuts", not realizing you can still just deploy it almost exactly the same as IPv4 with longer addresses if that's all you want.
Or they're deployed because it's difficult to use IPv6 without them, even if you want to. For instance, it's quite difficult to use Linux with IPv6 in a static configuration without any form of autodiscovery of addresses or routes; I've yet to achieve such a configuration. With IPv4, I can bring up the network in a tiny fraction of a second and have it work; with IPv6, the only successful configuration I've found takes many seconds to decide it has a working network, and sometimes flakes out entirely.
Challenge: boot up an AWS instance, configure networking using your preferred IP version, successfully make a connection to an external server using that version, and get a packet back, in under 500ms from the time your instance gets control, succeeding 50 times out of 50. Very doable with IPv4; I have yet to achieve that with IPv6.
> For instance, it's quite difficult to use Linux with IPv6 in a static configuration without any form of autodiscovery of addresses or routes; I've yet to achieve such a configuration. With IPv4, I can bring up the network in a tiny fraction of a second and have it work; with IPv6, the only successful configuration I've found takes many seconds to decide it has a working network, and sometimes flakes out entirely.
On IPv4 I assume you're doing something which boils down to (from whatever network configuration tool you use):
    ip addr add 192.168.1.100/24 dev eth0
    ip route add default via 192.168.1.1 dev eth0

Which maps directly to:

    ip -6 addr add 2001:db8:abcd:0012::100/64 dev eth0
    ip -6 route add default via 2001:db8:abcd:0012::1 dev eth0

If you're also doing a static ARP to be "fully" static then you'll also have an additional config which boils down to something akin to:

    ip neigh add 192.168.1.50 lladdr aa:bb:cc:dd:ee:ff dev eth0 nud permanent

Which maps to this config to statically set the MAC instead of using ND:

    ip -6 neigh add 2001:db8:abcd:0012::2 lladdr aa:bb:cc:dd:ee:ff dev eth0 nud permanent
In both cases you either need to still locally respond to dynamic ARP/ND requests or also statically configure the rest of the devices in the subnet (including the router) in a similar fashion, but there's not really much difference beyond the extra bits in the address.

> Challenge: boot up an AWS instance, configure networking using your preferred IP version, successfully make a connection to an external server using that version, and get a packet back, in under 500ms from the time your instance gets control, succeeding 50 times out of 50. Very doable with IPv4; I have yet to achieve that with IPv6.
I have a strong aversion to AWS... but if there is anything more difficult about this for IPv6 than IPv4 then that's entirely on what AWS likes to do rather than what IPv6 requires. E.g. if they only give you a dynamic link local gateway it's because they just don't want you to use a public address as the static gateway, not because IPv6 said it had to be so by not supporting unicast gateways or something.
There's also nothing about IPv6 ND that would make it take longer to discover the gateway from a statically configured unicast address than IPv4 ARP would take, but AWS may be doing a lot of optional stuff beyond just being a dumb gateway in their IPv6 implementation - again, not because IPv6 itself said it should be so but because they want to do whatever they are doing.
To put it from another perspective: If the situation was reversed would you be blaming IPv4 and saying IPv4 should have been designed differently or would you just be asking why this guy from Android doesn't want to add DHCPv4 when DHCPv6 is supported? In both situations it's not IPv4/IPv6 to blame for the inconvenience, it's the guy taking advantage of the transition between protocols to do something stupid at the same time. No amount of changing the definition of IP is going to make them like DHCP, they'll always push some SLAAC-like address assignment onto users. The only reason they didn't for IPv4 was they came in after it was already the way instead of before networks were deployed and they could force it.
It's often very difficult to use IPv6 in practice, but not because IPv6 made it that way.
I do not want to be a "reasonably-skilled admin". Not my job nor desire. I want DHCP to work and NAT to exist which acts as a de-facto firewall and hides my internal network config from the outside world. All with zero or fewer clicks in my home router's config. With IPv4 this works. With IPv6 it does not. Simple choice for me then: find the IPv6 checkbox and turn it off, as usual.
As a technologist, growing up involves learning not to blame the consumer. They are not holding it wrong, you just designed it in a dumb way.
And no interoperability between the two without stateful network address translation.
Even if we could directly address every device on the internet, you'd still mostly want to run through a middle server anyway so you can send files and messages while the receiver device is sleeping, or to sync between multiple devices.
Pretty much the only loss was people self hosting servers, but as long as you aren't behind CGNAT you can just set up DDNS and be fine. Every ISP I've been with lets you opt out of CGNAT as well as pay for a static IP.
Some more interesting history reading here:

https://www.internetsociety.org/blog/2016/09/final-report-on...
A big part of the move to 8-bit systems was that it allowed expanded text systems with letter casing, punctuation, and various ASCII stuff.
We could move to the world of 36-bit Fortran if really needed and solve all these problems, while introducing a problem called Fortran.
Then they decided to abandon their indigenous technology in favour of copying Western designs.
If you don't believe me, just ask Paula Bean.
https://scontent-lax3-2.xx.fbcdn.net/v/t39.30808-6/476277134...
You could have the equivalent of 45-bit numbers (44 + parity). And you could have the operands of two 15-bit numbers and their result encoded in 9 quint-bits, or "quits". Go pro or go home.
"DEC's 36-bit computers were primarily the PDP-6 and PDP-10 families, including the DECSYSTEM-10 and DECSYSTEM-20. These machines were known for their use in university settings and for pioneering work in time-sharing operating systems. The PDP-10, in particular, was a popular choice for research and development, especially in the field of artificial intelligence. "
"Computers with 36-bit words included the MIT Lincoln Laboratory TX-2, the IBM 701/704/709/7090/7094, the UNIVAC 1103/1103A/1105 and 1100/2200 series, the General Electric GE-600/Honeywell 6000, the Digital Equipment Corporation PDP-6/PDP-10 (as used in the DECsystem-10/DECSYSTEM-20), and the Symbolics 3600 series.
Smaller machines like the PDP-1/PDP-9/PDP-15 used 18-bit words, so a double word was 36 bits.
Oh wait. It's already been done.
It had 512 72-bit registers and was very SIMD/VLIW, and was probably the only machine ever with 81-bit instructions.
1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, and 60
Obviously not an emergent property but shows how these things were designed.
1 m = 1e-7 times the half-meridian from the North Pole to the equator, via Paris for a croissant, apparently.
So kind of a coincidence... but a very neat one. Meanwhile, the ratio of adjacent Fibonacci numbers converges to the golden ratio (1+sqrt(5))/2, which is approx 1.618.
Not that these are exclusive, but I thought it was a rounding of 365.25 days a year stemming from Egypt. 360 is a pretty useful number of degrees for a starry sky that changes once a night.
This was done for graphics reasons, native antialiasing if I understand it. The CPU can't use it; it still only sees 8-bit bytes.
https://www.youtube.com/watch?v=DotEVFFv-tk (Kaze Emanuar - The Nintendo 64 has more RAM than you think)
To summarize the relevant part of the video: the RDP wants to store pixel color in 18 bits, 5 bits red, 5 bits blue, 5 bits green, and 3 bits of triangle coverage. It then uses this coverage information to calculate a primitive but fast antialiasing, so SGI went with two 9-bit bytes for each pixel and magic in the RDP (remember, it's also the memory controller) so the CPU sees the 8-bit bytes it expects.
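A rough C sketch of that 5-5-5-3 layout; the exact field order inside the 18 bits is an assumption here, not taken from RDP documentation.

    #include <stdint.h>

    /* Pack 5-bit R/G/B plus 3 coverage bits into an 18-bit value
       (which would occupy two 9-bit bytes of RDRAM). */
    static uint32_t pack_pixel(uint32_t r, uint32_t g, uint32_t b, uint32_t cvg)
    {
        return ((r & 0x1F) << 13) |   /* bits 17..13: red      */
               ((g & 0x1F) << 8)  |   /* bits 12..8 : green    */
               ((b & 0x1F) << 3)  |   /* bits  7..3 : blue     */
               (cvg & 0x07);          /* bits  2..0 : coverage */
    }

    static void unpack_pixel(uint32_t p, uint32_t *r, uint32_t *g,
                             uint32_t *b, uint32_t *cvg)
    {
        *r   = (p >> 13) & 0x1F;
        *g   = (p >> 8)  & 0x1F;
        *b   = (p >> 3)  & 0x1F;
        *cvg = p & 0x07;
    }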
Memory on the N64 is very weird: it is basically the same idea as PCIe, but for main memory. PCI was a big, fat bus that is hard to speed up; PCIe is a narrow, super fast bus. So the CPU was clocked at 93 MHz but the memory was a 9-bit bus clocked at 250 MHz. They were hoping this super fast narrow memory would be enough for everyone, but having the graphics chip also be the memory controller made the graphics very sensitive to memory load, to the point that the main thing that helps an N64 game get a higher frame rate is having the CPU do as few memory lookups as possible, which in practical terms means having it idle as much as possible. This has a strange side effect: while a common optimization on most architectures is to trade calculation for memory (unrolled loops, lookup tables...), on the N64 it can be the opposite. If you can make your code do more calculation with less memory, you can utilize the CPU better, because it is mostly sitting idle to give the RDP most of the memory bandwidth.
Some hardware circuits are a bit nicer with power-of-two sizes but I don't think it's a huge difference, and hardware has to include weird stuff like 24-bit and 53-bit multipliers for floating-point anyway (which in this alternate world would be probably 28-bit and 60-bit?). Not sure a few extra gates would be a dealbreaker.
[1] https://web.archive.org/web/20170404160423/http://archive.co...
[2] https://web.archive.org/web/20170404161611/http://archive.co...
One possibility would be bit-indexed addressing. For the 9-bit case, yes, such an index would need 4 bits. If one wanted to keep the instruction set encoding nice and clean, that would result in an underutilized 4th bit. Coming up with a more complex encoding would cost silicon.
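A small sketch of what that implies (helper names invented): the bit-within-byte field spans 0..8, so it needs 4 bits but uses only 9 of the 16 encodings, and splitting a flat bit address back into byte/bit is a division rather than a shift, because 9 is not a power of two.

    #include <stdint.h>

    #define BITS_PER_BYTE 9

    /* Flat bit address from a (byte, bit) pair; bit_in_byte is 0..8. */
    static uint64_t flat_bit_address(uint64_t byte_index, unsigned bit_in_byte)
    {
        return byte_index * BITS_PER_BYTE + bit_in_byte;
    }

    /* The reverse split needs real division/modulo, not a shift/mask. */
    static void split_bit_address(uint64_t flat, uint64_t *byte_index,
                                  unsigned *bit_in_byte)
    {
        *byte_index  = flat / BITS_PER_BYTE;
        *bit_in_byte = (unsigned)(flat % BITS_PER_BYTE);
    }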
What other cases are you thinking of?
Self-correction: In what cases does going from 8-bit bytes to 9-bit bytes result in a penalty, and how much is it?
If accessing a bit is really accessing a larger block and throwing away most of it in every case, then the additional byte grouping isn't really helping much.
A one-bit wide bus ... er, wire, now, I guess ... could work just fine, but now we are extremely limited in the number of operations achievable, as well as in the amount of addressable data: an eight-bit address can now only reference a maximum of 32 bytes of data, which is so small as to be effectively useless.
It's an arbitrary grouping, and worse, it's rarely useful to think in terms of it. If you are optimizing access patterns, then you are thinking in terms of CPU words, cache line sizes, memory pages, and disk sectors. None of those are bytes.
In clothing stores, numerical clothes sizes have steadily grown a little larger.
The same make and model car/suv/pickup have steadily grown larger in stance.
I think what is needed is to silently add 9-bit bytes, but don't tell anyone.
Got to stop somewhere.
Note to the author, put this up front, so I know that you did the bare minimum and I can safely ignore this article for the slop it is.
At first I thought that was a nice way to handle credit, but on further thought I wonder if this is necessary because the baseline assumption is that everyone is using LLMs to help them write.
Thank you to Android for mobile Internet connectivity, browsing, and typing.
Whoops ^ To be fair, technically, I also contain some factual errors, if you consider the rare genetic mutation or botched DNA transcription.
So far, I haven't found anything that I would consider to be a glaring factual error. What did I miss?
I'm not talking merely about a difference in imagination of how the past might have unfolded. If you view this as an alternative history, I think the author made a plausible case. Certainly not the only way; reasonable people can disagree.
https://en.wikipedia.org/wiki/Six-bit_character_code#DEC_SIX...
Notably the PDP-8 had 12-bit words (2x6) and the PDP-10 had 36-bit words (6x6)
Notably the PDP-10 had addressing modes where it could address a run of bits inside a word, so it was adaptable to working with data from other systems. I've got some notes on a fantasy computer that has 48-bit words (fit inside a JavaScript double!) and a mechanism like the PDP-10 where you can write "deep pointers" that have a bit offset and length and can even hang into the next word; with the length set to zero bits this could address UTF-8 character sequences. Think of a world where something like the PDP-10 inspired microcomputers, was used by people who used CJK characters, and had a video system that would make the NeoGeo blush. Crazy I know.
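A rough sketch of that deep-pointer idea, with the 48-bit words emulated in the low bits of uint64_t and all names invented for illustration:

    #include <stdint.h>

    typedef struct {
        uint32_t word;  /* index of the word the field starts in       */
        uint8_t  bit;   /* 0..47, offset from the most significant bit */
        uint8_t  len;   /* field width in bits, 1..48                  */
    } deep_ptr;

    #define WORD_BITS 48
    #define WORD_MASK ((1ULL << WORD_BITS) - 1)

    /* Extract a bit field that may hang over into the next word. */
    static uint64_t deep_load(const uint64_t *mem, deep_ptr p)
    {
        uint64_t w0 = mem[p.word] & WORD_MASK;

        if (p.bit + p.len <= WORD_BITS) {
            /* Field fits entirely within one word. */
            return (w0 >> (WORD_BITS - p.bit - p.len)) & ((1ULL << p.len) - 1);
        } else {
            /* Field spills into the following word. */
            unsigned spill = p.bit + p.len - WORD_BITS;            /* bits taken from word+1 */
            uint64_t head  = w0 & ((1ULL << (WORD_BITS - p.bit)) - 1);
            uint64_t w1    = mem[p.word + 1] & WORD_MASK;
            return (head << spill) | (w1 >> (WORD_BITS - spill));
        }
    }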
The article says:
> A number of 70s computing systems had nine-bit bytes, most prominently the PDP-10
This is false. If you ask ChatGPT "Was the PDP-10 a 9 bit computer?" it says "Yes, the PDP-10 used a 36-bit word size, and it treated characters as 9-bit bytes."
But if you ask any other LLM or look it up on Wikipedia, you see that:
> Some aspects of the instruction set are unusual, most notably the byte instructions, which operate on bit fields of any size from 1 to 36 bits inclusive, according to the general definition of a byte as a contiguous sequence of a fixed number of bits.
-- https://en.wikipedia.org/wiki/PDP-10
So the PDP-10 didn't have 9-bit bytes, but could support them. Characters were typically 6 bits, but 7-bit and 9-bit characters were also sometimes used.
My first machines were the IBM 7044 (36-bit word) and the PDP-8 (12-bit word), and I must admit to a certain nostalgia for that style of machine (as well as the fact that a 36-bit word gives you some extra floating-point precision), but as others have pointed out, there are good reasons for power-of-2 byte and word sizes.
AI is not just wrong, it's so stupendously, stupidly wrong you need to change its drool bib. The PDP-10 word typically held "six 6-bit ASCII characters, supporting the upper-case unaccented letters, digits, space, and most ASCII punctuation characters. It was used on the PDP-6 and PDP-10 under the name sixbit."
"So PDP-10 didn't have 9-bit bytes, but could support them. Characters were typically 6 bytes, but 7-bit and 9-bit characters were also sometimes used."
Note: "four 9-bit characters[1][2] (the Multics convention)." Confirmed.
</END!!!>
AFAIK only Multics used four 9-bit characters on the PDP-10s; I believe five 7-bit ASCII characters per word were fairly common later on in the PDP-7/10 lifetime.
A reminder of that past history is that in Internet standards documents, the word "octet" is used to unambiguously refer to an 8-bit byte. Also, "octet" is the French word for byte, so a "gigaoctet (Go)" is a gigabyte (GB) in English.
(Now, if only we could pin down the sizes of C/C++'s char/short/int/long/long-long integer types...)
Octad/octade was unambiguously about 8 bit bytes, but fell out of popular usage.
Because I have a ten-year-old Dell laptop with 40GB of RAM, 16GB seems like an arbitrary limitation, an engineering compromise, or something like that.
I don't see how it is a result of 8-bit bytes, because 64 bits is a lot of address space.
And because my laptop is running Windows 10 currently and ran Ubuntu before that, ordinary operating systems are sufficient.
—-
Also ECC RAM is 9 bits per byte.
We need to be better at estimating required sizes, not try to trick ourselves into accomplishing that by slipping an extra bit into our bytes.
What if instead of using single bytes, we used "doublebytes"?
8-bit software continues to work, while new 16-bit "doublebyte" software gets 256x the value capacity, instead of a meager 2x.
Nobody will ever need more byte space than that!
Without requiring any changes to CPU/GPU, RAM, SSD, Ethernet, WiFi ...
Magic. :)
The fact that Intel managed to push their shitty market segmentation strategy of only even supporting ECC RAM on servers has rather nefarious and long-lasting consequences.
Yeah, I wonder why. It's not IPv6's problem though, it's definitely Github's.
Anyway, it's not a good example, since IPv6 is vastly wider than a 9-bit variant of IPv4 would have been.
And 2^32 = 4B is similarly, awkwardly, not quite big enough for global things related to numbers of people, or for second-based timestamps.
But a 9th bit isn't going to solve those things either. The real problem is that powers-of-two-of-powers-of-two, where we jump from 256 to 65K to 4B to 18QN (quintillion), are just not fine-grained enough for efficient usage of space.
It might be nice if we could also have 2^12=4K, 2^24=16M, and 2^48=281T as more supported integer bit lengths used for storage, both in memory and on disk. But is it really worth the effort? Maybe in databases? Obviously 16M colors has a long history, but that's another example where color banding in gradients makes it clear that it hasn't been quite enough either.
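For what it's worth, a 48-bit storage integer is easy enough to emulate today; here's a hedged little C sketch (helper names invented) of the kind of load/store pair a database might use for a 6-octet on-disk field:

    #include <stdint.h>

    typedef struct { uint8_t b[6]; } uint48_storage;   /* 48 bits = 6 octets */

    /* Read a 48-bit value out of its packed, little-endian form. */
    static uint64_t load_u48(const uint48_storage *s)
    {
        uint64_t v = 0;
        for (int i = 0; i < 6; i++)
            v |= (uint64_t)s->b[i] << (8 * i);
        return v;                                /* 0 .. 2^48 - 1 */
    }

    /* Write the low 48 bits of v; anything above 2^48 is truncated. */
    static void store_u48(uint48_storage *s, uint64_t v)
    {
        for (int i = 0; i < 6; i++)
            s->b[i] = (uint8_t)(v >> (8 * i));
    }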
I think it's actually better to run out of IPv4 addresses before the world is covered!
The later-adopting countries that can't get IPv4 addresses will just start with IPv6 from the beginning. This gives IPv6 more momentum. In big, expensive transitions, momentum is incredibly helpful because it eliminates that "is this transition even really happening?" collective self-doubt feeling. Individual members of the herd feel like the herd as a whole is moving, so they ought to move too.
It also means that funds available for initial deployment get spent on IPv6 infrastructure, not IPv4. If you try to transition after deployment, you've got a system that mostly works already and you need to cough up more money to change it. That's a hard sell in a lot of cases.
a little?
Author seems to be unaware that octet is etymologically linked to 8.
If you think there's some equally bad coincidence, go ahead and tell me, but no one has yet. I think I do a good job of that in the post. (Also, it's amazing that you and maybe everyone else assume I know nothing except what ChatGPT told me? There are no ads on the website, it's got my name, face, and job on it, etc. I stand by what I wrote.)
Seriously though, we can always do more with one more bit. That doesn't mean we'd be better off. 8 bits is a nice symmetry with powers of two.
Yeah okay, this is completely pointless... so now we have to verify everything this guy published?
9 bit bytes never made significant headway because a 12.5% overhead cost for any of these alternatives is pretty wild. But there are folks and were folks then who thought it was worth debating and there certainly are advantages to it, especially if you look at use beyond memory storage. (i.e. closer to "Harvard" architecture separation between data / code and security implications around strict separation of control / data in applications like networking.)
It's worth noting that classic SECDED ECC memory adds the same 12.5% overhead (72 bits per 64-bit word), yet it can correct single bit flips, whereas 9-bit bytes with a parity bit can only detect (but not correct) bit flips, which makes parity useful in theory but not very useful in practice.
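For reference, a small C sketch of the standard Hamming/SECDED sizing rule: the smallest r with 2^r >= m + r + 1 gives single-error correction for m data bits, and one extra overall-parity bit adds double-error detection, which is 7 + 1 = 8 check bits for a 64-bit word.

    #include <stdio.h>

    /* Number of check bits needed for SECDED over data_bits of data. */
    static unsigned secded_check_bits(unsigned data_bits)
    {
        unsigned r = 0;
        while ((1u << r) < data_bits + r + 1)
            r++;
        return r + 1;   /* +1 for the overall parity (DED) bit */
    }

    int main(void)
    {
        /* 64 data bits -> 8 check bits: the classic 72-bit ECC word, 12.5% overhead. */
        printf("check bits for 64 data bits: %u\n", secded_check_bits(64));
        /* Per 8-bit byte it would be 5 check bits, i.e. 62.5% overhead. */
        printf("check bits for 8 data bits: %u\n", secded_check_bits(8));
        return 0;
    }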