Gigabyte versus Gibibyte

Are the hard drive vendors screwing us?

The answer is no. At least not when it comes to the number of bytes they promise you can store on their drives. Oh really?

In July 2012 I wrote a blog post on “saying what you mean to say”, so people cannot misinterpret what you’re trying to point out. Gigabyte, gibibyte, joule, calorie, kilocalorie, degrees Celsius, but not degrees kelvin (it’s just kelvin, or capital K).

Living in caves

Now you probably think: “what on Earth are gibibytes?”, but make no mistake here: since 1999 one GiB has been defined as exactly 1024 x 1024 x 1024 bytes. A long time ago, in the 80s, when we still lived in caves and were hunting wild animals for food, the IT industry considered it acceptable to say 1 kB when in fact 1024 bytes were meant. The error was only 24 bytes on each 1000 bytes (2.4%) and back then it made the conversation a lot easier. Computers use powers of 2, and 2 to the power of 10 is 1024, not 1000, but saying you needed 1024 bytes was a little tricky and 1 kilobyte was a lot easier to pronounce.

Over the years technology improved and we got our megabytes, gigabytes and even terabytes, and all of a sudden it dawned on people that the margin of error had grown from 2.4% to as much as 10%. When you’re selling storage to customers it’s important to think this through: 1000 GB ≈ 931 GiB and 1 TB ≈ 0.909 TiB.
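To see how the gap grows with each prefix, here is a minimal Python sketch. The names `PREFIXES` and `prefix_gap_percent` are made up for this illustration; the only facts used are that decimal prefixes step by 1000 and binary prefixes step by 1024.

```python
# Decimal (SI) and binary (IEC) units, paired with the power they represent.
# The names in this sketch are illustrative, not from any library.
PREFIXES = [
    ("kB", "KiB", 1),
    ("MB", "MiB", 2),
    ("GB", "GiB", 3),
    ("TB", "TiB", 4),
]


def prefix_gap_percent(power: int) -> float:
    """Return how much larger the binary unit is than the decimal one, in percent."""
    return (1024 ** power / 1000 ** power - 1) * 100


for si_unit, iec_unit, power in PREFIXES:
    gap = prefix_gap_percent(power)
    print(f"1 {iec_unit} is {gap:.1f}% larger than 1 {si_unit}")
```

The gap is 2.4% at the kilo level and climbs to about 10% at the tera level, which is why nobody cared in the 80s and everybody is confused now.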

Apples and pears

Are we losing space in some mysterious way? Are we getting less than we think we should be getting? Well, perhaps, if you think you’re only getting 1000 apples when you paid for a million apples. But GB and GiB aren’t both apples: they’re apples and pears. Don’t compare the two! We’re not losing anything at all, not before formatting and certainly not afterwards. Some exceptions apply when you’re using RAID controllers or file systems that add a little extra error correcting code, and file allocation tables also take up space, but you know they’re using space, and the unformatted space is what the manufacturer sells you (what’s on the label). EMC, for example, uses 520 byte sectors instead of the 512 that’s normally used, but that’s for the extra ECC that I just mentioned, so this can be traced back!

So if a Windows host tells you that you only have 931 GB, it actually means binary gigabytes (GiB), and that’s the same amount as 1 TB! 931 x 1024 x 1024 x 1024 ≈ 1,000,000,000,000 bytes (999,653,638,144 to be exact, because 931 is itself rounded down from 931.32). To make things clearer: 1 TB ≈ 0.909 TiB (1,000,000,000,000 / (1024 x 1024 x 1024 x 1024)), so that’s where the roughly 10% deviation comes from. There is no deviation! It’s all a matter of getting your prefixes right!
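The arithmetic is easy to check yourself. This is a small sketch in Python; the function name `to_binary_units` is an invention for this example and not part of any operating system or library.

```python
def to_binary_units(num_bytes: int) -> dict:
    """Express a byte count in GiB and TiB (binary prefixes, powers of 1024)."""
    return {
        "GiB": num_bytes / 1024 ** 3,
        "TiB": num_bytes / 1024 ** 4,
    }


# A drive sold as "1 TB" holds 10^12 bytes (decimal prefix).
one_tb_in_bytes = 10 ** 12
units = to_binary_units(one_tb_in_bytes)

print(f"{one_tb_in_bytes:,} bytes = {units['GiB']:.2f} GiB = {units['TiB']:.4f} TiB")

# Going the other way: 931 GiB, as a rounded-down display value, in bytes.
print(f"931 GiB = {931 * 1024 ** 3:,} bytes")
```

The byte count never changes. Only the unit you divide by does.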

So why does an operating system say GB when GiB is meant? Why do people say “degrees kelvin” (instead of just “kelvin”) or that you weigh this many “kilo” (instead of “kilogram”)? It’s called ease of use. When talking about weight there can be no mistake: 1 kilo is 1000, and since we’re talking weight only grams can be meant, so kilo sort of equals kilogram. But gibibyte and gigabyte are two completely different things! We say 1 kHz when we mean 1000 hertz, right? We say 1 kg when we mean 1000 grams, right? We say 1 kcal when we mean 1 big calorie, or 1000 (small) calories. So why not use GiB when we mean the binary amount?

Use the same language otherwise we’ll get misunderstandings

Summary: watch the units you use in speech and in writing, so that we all use the same language. This saves us from nasty misunderstandings.

As mentioned in my previous post back in July 2012, the IEC wrote an article on the “new” binary prefixes back in 1999! You can find it at http://www.iec.ch/tcnews/archives/pdf/tclet6.pdf so please read it and start using the correct prefixes!

Example: a 100 Mbps link carries 100,000,000 bits per second. Divide those 100 million by 8 to get bytes instead of bits: 12.5 MB per second.

If you want to express this with binary prefixes, it is 95.37 Mibit per second or 11.92 MiB per second. Mark the extra “i”! We’re dividing by (1024 * 1024), so it becomes binary. Computers use a base 2 system, and binary amounts get binary prefixes, with the extra “i” and a capital first letter.
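The same link speed in all four notations, as a quick Python sketch. The variable names are only for illustration; the 100 Mbps figure is the one from the example above.

```python
# A "100 Mbps" link uses the decimal prefix: 100 * 10^6 bits per second.
link_bits_per_second = 100 * 10 ** 6
link_bytes_per_second = link_bits_per_second / 8

# Decimal prefixes divide by 1000 * 1000, binary prefixes by 1024 * 1024.
megabits = link_bits_per_second / 1000 ** 2
megabytes = link_bytes_per_second / 1000 ** 2
mebibits = link_bits_per_second / 1024 ** 2
mebibytes = link_bytes_per_second / 1024 ** 2

print(f"{megabits:.2f} Mbit/s  = {megabytes:.2f} MB/s")
print(f"{mebibits:.2f} Mibit/s = {mebibytes:.2f} MiB/s")
```

All four numbers describe exactly the same link. The prefix tells you which divisor was used.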
