In 1972 IBM started selling the IBM 3333 magnetic disk drive. This product catalog [0] from 1979 shows them marketing the corresponding disks as "100 million bytes" or "200 million bytes" (3336 mdl 1 and 3336 mdl 11, respectively). By 1984, those same disks were marketed in the "IBM Input/Output Device Summary" [1] (which seems to have been made available to interested parties) as "100MB" and "200MB".
So this ambiguity is documented at least back to 1984, by the pre-eminent computer company of the time.
0: (PDF page 281) "IBM 3330 DISK STORAGE" http://electronicsandbooks.com/edt/manual/Hardware/I/IBM%20w...
1: (PDF page 38, labeled page 2-7, Fig 2-4) http://electronicsandbooks.com/edt/manual/Hardware/I/IBM%20w...
Also, hats off to http://electronicsandbooks.com/ for keeping such an incredible record available for the internet to browse.
-------
The article presents wishful thinking. The wish is for "kilobyte" to have one meaning. For the majority of its existence, it had only one meaning - 1024 bytes. Now it has an ambiguous meaning. People wish for an unambiguous term for 1000 bytes, but that word does not exist. People also might wish that others use kibibyte any time they reference 1024 bytes, but that is also wishful thinking.
The author's wishful thinking is falsely presented as fact.
I think kilobyte was the wrong word to ever use for 1024 bytes, and I'd love to go back in time to tell computer scientists that they needed to invent a new prefix to mean "1,024" / "2^10" of something, which kilo- never meant before kilobit / kilobyte were invented. Kibi- is fine; the phonetics sound slightly silly to native English speakers, but the 'bi' indicates binary and I think that's reasonable.
I'm just not going to fool myself with wishful thinking. If, in arrogance or self-righteousness, one simply assumes that every time they see "kilobyte" it means 1,000 bytes - then they will make many, many mistakes. We will always have to take care to verify whether "kilobyte" means 1,000 or 1,024 bytes before implementing something which relies on that for correctness.
There was always confusion about whether a kilobyte was 1000 or 1024 bytes. Early diskettes always used 1000; only when the 8-bit home computer era started was the 1024 convention firmly established.
Before that it made no sense to talk about kilo as 1024. Earlier computers measured space in records and words, and I guess you can see how in 1960, no one would use kilo to mean 1024 for a 13-bit computer with 40-byte records. A kiloword was, naturally, 1000 words, so why would a kilobyte be 1024?
1024 being near-ubiquitous was only the case in the 90s or so - except for drive manufacturing and signal processing. Binary prefixes didn't invent the confusion; they were a partial solution. As you point out, while it's possible to clearly indicate binary prefixes, we have no unambiguous notation for decimal bytes.
Even worse, the 3.5" HD floppy disk format used a confusing combination of the two. Its true capacity (when formatted as FAT12) is 1,474,560 bytes. Divide that by 1024 and you get 1440KB; divide that by 1000 and you get the oft-quoted (and often printed on the disk itself) "1.44MB", which is inaccurate no matter how you look at it.
I wonder if there's a Wikipedia article listing these...
Similarly, the 4104 chip was a "4kb x 1 bit" RAM chip and stored 4096 bits. You'd see this in the whole 41xx series, and beyond.
I was going to say that what it could address and what they called what it could address is an important distinction, but found this fun ad from 1976[1].
"16K Bytes of RAM Memory, expandable to 60K Bytes", "4K Bytes of ROM/RAM Monitor software", seems pretty unambiguous that you're correct.
Interestingly, Wikipedia at least implies the IBM System/360 popularized the base-2 prefixes[2], citing their 1964 documentation, but I can't find any use of it in there for the main core storage docs they cite[3]. Amusingly, the only use of "kb" I can find in the pdf is for data rate off magnetic tape, which is explicitly defined as "kb = thousands of bytes per second", and the only reference to "kilo-" is for "kilobaud", which would have again been base-10. If we give them the benefit of the doubt on this, presumably it was from later System/360 publications where they would have had enough storage to need prefixes to describe it.
[1] https://commons.wikimedia.org/wiki/File:Zilog_Z-80_Microproc...
[2] https://en.wikipedia.org/wiki/Byte#Units_based_on_powers_of_...
[3] http://www.bitsavers.org/pdf/ibm/360/systemSummary/A22-6810-...
Example: in 1972, the DEC PDP-11/40 handbook [0] said on the first page: "16-bit word (two 8-bit bytes), direct addressing of 32K 16-bit words or 64K 8-bit bytes (K = 1024)". Same with Intel - in 1977 [1], they proudly said "Static 1K RAMs" on the first page.
[0] https://pdos.csail.mit.edu/6.828/2005/readings/pdp11-40.pdf
[1] https://deramp.com/downloads/mfe_archive/050-Component%20Spe...
More like late 60s. In fact, in the 70s and 80s, I remember the storage vendors being excoriated for "lying" by following the SI standard.
There were two proposals to fix things in the late 60s, by Donald Morrison and Donald Knuth. Neither was accepted.
Another article suggesting we just roll over and accept the decimal versions is here:
https://cacm.acm.org/opinion/si-and-binary-prefixes-clearing...
This article helpfully explains that decimal KB has been "standard" since the very late 90s.
But when such an august personality as Donald Knuth declares the proposal DOA, I have no heartburn using binary KB.
That's the microcomputer era that has defined the vast majority of our relationship with computers.
IMO, having lived through this era, the only people pushing 1,000 byte kilobytes were storage manufacturers, because it allows them to bump their numbers up.
https://www.latimes.com/archives/la-xpm-2007-nov-03-fi-seaga...
In fact, they practically say the same exact thing you have said: in a nutshell, base-10 prefixes were used for base-2 numbers, and now it's hard to undo that standard in practice. They didn't say anything about making assumptions. The only difference is that the author wants to keep trying and you don't think it's possible, which is perfectly fine. It's just not as dramatic as your tone implies.
Which is the reality. "kilobyte" means "1000 bytes". There's no possible discussion over this fact.
Many people have been using it wrong for decades, but its literal value did not change.
You are free to intend only one meaning in your own communication, but you may sometimes find yourself being misunderstood: that, too, is reality.
You can say that one meaning is more correct than the other, but that doesn't vanish the other meaning from existence.
Now, it depends.
In fact, this is the only case I can think of where that has ever happened.
https://www-cs-faculty.stanford.edu/~knuth/news99.html
And he was right.
Context is important.
"K" is an excellent prefix for 1024 bytes when working with small computers, and a metric shit ton of time has been saved by standardizing on that.
When you get to bigger units, marketing intervenes, and, as other commenters have pointed out, we have the storage standard of MB == 1000 * 1024.
But why is that? Certainly it's because of the marketing, but also it's because KB has been standardized for bytes.
> Which is the reality. "kilobyte" means "1000 bytes". There's no possible discussion over this fact.
You couldn't be more wrong. Absolutely nobody talks about 8K bytes of memory and means 8000.
You need character to admit that. I bow to you.
Which makes it really @#ing annoying when you have things like "I want to transmit 8 gigabytes (meaning gibibytes, 2^30) over a 1 gigabit/s link, how long will it take?". Welcome to every networking class in the 90s.
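Worked through (a minimal sketch; it assumes only that network "giga" is the decimal SI giga, as the comment says):

```python
data_bits = 8 * 2**30 * 8     # 8 GiB of data, expressed in bits
link_bps = 10**9              # 1 gigabit/s: link rates use decimal giga
print(data_bits / link_bps)   # ~68.72 seconds, not the "obvious" 64
```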
We should continue moving towards a world where 2^k prefixes have separate names and we use SI prefixes only for their precise base-10 meanings. The past is polluted but we hopefully have hundreds of years ahead of us to do things better.
* Yeah, I read the article. Regardless of the IEC's noble attempt, in all my years of working with people and computers I've never heard anyone actually pronounce MiB (or write it out in full) as "mebibyte".
It doesn't matter. "kilo" means 1000. People are free to use it wrong if they wish.
What the hell is a "kibibyte"? Sounds like a brand of dog food.
Why doesn't kilobyte continue to mean 1024, with kilodebyte introduced to mean 1000? Byte, to me, implies a binary number system, and if you want to introduce a new nomenclature to reduce confusion, give the new one a new name and let the older, more prevalent one in its domain keep the old one…
Many things acquire domain-specific nuanced meanings...
"in binary computing traditionally prefix + byte implied binary number quantities."
There are no bytes involved in Hz or FLOPs.
Because it never did!
"I will not sacrifice my dignity. We've made too many compromises already; too many retreats. They invade our space and we fall back. They assimilate entire worlds with awkward pronunciations. Not again. The line must be drawn here! This far, no further! And I will make them pay for what they've done to the kilobyte!"
"I bought a two tib SSD."
"I just want to serve five pibs."
You can use `--si` for fake, 1000-byte kilobytes - trying it, it seems weird that these are reported with a lowercase 'k' but 'M' and so on remain uppercase.
For SI units, the abbreviations are defined, so a lowercase k for kilo and uppercase M for mega is correct. Lower case m is milli, c is centi, d is deci. Uppercase G is giga, T is tera and so on.
https://en.wikipedia.org/wiki/International_System_of_Units#...
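To illustrate the capitalization rules, a small sketch of both conventions (the function name and exact layout are mine, not from any standard library):

```python
def human(n: float, si: bool = False) -> str:
    """Format a byte count with SI prefixes (k, M, G = powers of 1000)
    or binary prefixes (Ki, Mi, Gi = powers of 1024). Note the
    SI-correct lowercase 'k' alongside uppercase 'M', 'G', 'T'."""
    base = 1000 if si else 1024
    prefixes = ["", "k", "M", "G", "T"] if si else ["", "Ki", "Mi", "Gi", "Ti"]
    for p in prefixes:
        if n < base or p == prefixes[-1]:
            return f"{n:.1f} {p}B" if p else f"{n:.0f} B"
        n /= base

print(human(123_456_789))           # 117.7 MiB
print(human(123_456_789, si=True))  # 123.5 MB
```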
You don't need to show your ignorance this clearly!
I gave some examples in my post https://blog.zorinaq.com/decimal-prefixes-are-more-common-th...
Agreed. For the naysayers out there, consider these problems (worked answers follow the list):
* You have 1 "MB" of RAM on a 1 MHz system bus which can transfer 1 byte per clock cycle. How many seconds does it take to read the entire memory?
* You have 128 "GB" of RAM and you have an empty 128 GB SSD. Can you successfully hibernate the computer system by storing all of RAM on the SSD?
* My camera shoots 6000×4000 pixels = exactly 24 megapixels. If you assume RGB24 color (3 bytes per pixel), how many MB of RAM or disk space does it take to store one raw bitmap image matrix without headers?
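For the record, a sketch of the answers (my arithmetic, under the assumptions stated in each problem):

```python
MIB, GIB = 2**20, 2**30

# 1. Reading 1 MiB of RAM over a 1 MHz bus at 1 byte per cycle:
print(MIB / 1_000_000)           # 1.048576 s -- not a clean 1 s

# 2. Hibernating 128 GiB of RAM onto a 128 GB (decimal) SSD:
print(128 * GIB <= 128 * 10**9)  # False -- it does not fit

# 3. One raw 24-megapixel RGB24 frame, no headers:
raw = 6000 * 4000 * 3
print(raw / 10**6, raw / MIB)    # 72.0 decimal MB, ~68.66 MiB
```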
The SI definitions are correct: kilo- always means a thousand, mega- always means a million, et cetera. The computer industry abused these definitions because 1000 is close to 1024, creating endless confusion. It is an idiotic act of self-harm when one "megahertz" of clock speed is not the same mega- as one "megabyte" of RAM. IEC 60027 prefixes are correct: there is no ambiguity when kibi- (Ki) is defined as 1024, and it can coexist beside kilo- meaning 1000.
The whole point of the metric system is to create universal units whose meanings don't change depending on context. Having kilo- be overloaded (like method overloading) to mean 1000 and 1024 violates this principle.
If you want to wade in the bad old world of context-dependent units, look no further than traditional measures. International mile or nautical mile? Pound avoirdupois or Troy pound? Pound-force or pound-mass? US gallon or UK gallon? US shoe size for children, women, or men? Short ton or long ton? Did you know that just a few centuries ago, every town had a different definition of a foot and pound, making trade needlessly complicated and inviting open scams and frauds?
A 32 Gb RAM chip = 4 GiB of RAM.
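That is, assuming the chip's "Gb" means binary gigabits, as DRAM densities conventionally do:

```python
bits = 32 * 2**30        # a "32 Gb" DRAM die
print(bits / 8 / 2**30)  # 4.0 -- exactly 4 GiB of bytes
```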
They didn't abuse the definitions. It's simply the result of dealing with pins, wires, and bits. For your problems, for example, you won't ever have a system with 1 "MB" of RAM where that's 1,000,000 bytes. The 8086 processor had 20 address lines: 2^20 is 1,048,576 bytes for 1MB. SI units make no sense for computers.
The only problem is unscrupulous hardware vendors using SI units on computers to sell you less capacity but advertise more.
I disagree strongly - I think X-per-second should be decimal, to correspond to Hertz. But for quantity, binary seems better. (Modern CS papers tend to use MiB, GiB, etc. as abbreviations for the binary units.)
Fun fact - for a long time consumer SSDs had roughly 7.37% over-provisioning, because that's what you get when you put X GB (binary) of raw flash into a box, and advertise it as X GB (decimal) of usable storage. (probably a bit less, as a few blocks of the X binary GB of flash would probably be DOA) With TLC, QLC, and SLC-mode caching in modern drives the numbers aren't as simple anymore, though.
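Where that 7.37% comes from (a one-line check):

```python
# X GiB of raw flash, advertised as X GB (decimal) of usable storage:
overprovision = 2**30 / 10**9 - 1
print(f"{overprovision:.4%}")    # 7.3742%
```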
It would be nice to have a different standard for decimal vs. binary kilobytes.
But if Don Knuth thinks that the "international standard" naming for binary kilobytes is dead on arrival, who am I to argue?
Because Windows, and only Windows, shows it this way. It is official and documented: https://devblogs.microsoft.com/oldnewthing/20090611-00/?p=17...
> Explorer is just following existing practice. Everybody (to within experimental error) refers to 1024 bytes as a kilobyte, not a kibibyte. If Explorer were to switch to the term kibibyte, it would merely be showing users information in a form they cannot understand, and for what purpose? So you can feel superior because you know what that term means and other people don’t.
The author doesn’t actually answer their question, unless I missed something?
They go on to make a few more observations, and say finally only that the current different definitions are sometimes confusing to non-experts.
I don't see much of an argument here for changing anything. Some non-experts experience minor confusion about two things that are different; did I miss something bigger in this?
Approximating metric prefixing with kibi, mebi, gibi... is confusing because it doesn't make sense semantically. There is nothing base-10-ish about it.
I propose some naming based on shift distance, derived from the Latin iterativum.
* 2^10, the kibibyte, is a deci (shifted) byte, or just a 'deci'
* 2^20, the mebibyte, is a vici (shifted) byte, or a 'vici'
* 2^30, the gibibyte, is a trici (shifted) byte, or a 'trici'
I mean, we really only need to think in bytes for memory addressing, right? The base doesn't matter much, if we were talking exabytes, does it?
It's the same reason—for pure marketing purposes—that screens are measured diagonally.
It's easy to find some that are marketed as 500GB and have 500x10^9 bytes [0]. But all the NVMes that I can find that are marketed as 512GB have 512x10^9 bytes [1], neither 500x10^9 bytes nor 2^39 bytes. I cannot find any that are labeled "1TB" and actually have 1 tebibyte. Even "960GB" enterprise SSDs are measured in base-10 gigabytes [2].
0: https://download.semiconductor.samsung.com/resources/data-sh...
1: https://download.semiconductor.samsung.com/resources/data-sh...
2: https://image.semiconductor.samsung.com/resources/data-sheet...
(Why are these all Samsung? Because I couldn't find any other datasheets that explicitly call out how they define a GB/TB)