(s)low budget drives: the future of archiving

Storage growth

Most of the data we collect and store on our computers eventually ends up in some sort of archive. I think we can all agree on that, right? Do we ever throw anything away? Well, some data stops being useful after a while and can (and will) be deleted, but a lot of data “might be useful” some day, and so we keep it. And don’t forget the tons of digital memories we create using photo and video cameras! I estimate that I’m creating about 100 GB of digital photos and videos per year, and that number grows every year with each new camera we buy. More pixels, DSLR cameras, RAW photography and HD or even 4K video probably account for most of the extra space we need each year.

Where do we store our data?

And where do we keep it? On tape? CDs or DVDs? Really? With disk prices dropping fast (again), we don’t bother buying other storage devices; we just keep the data on our cheap rotating disks. Cheap to buy, that is, because they are “environmentally expensive”! Every rotating disk has an electric motor to keep the platters spinning, and even at only 5400 RPM it still produces noise and heat. A rotating disk doesn’t live forever either: it has moving parts which can break, the heads can crash onto the platters and magnetic fields can slowly erase your precious data.

Increasing needs

So how much important data do I have? I guess it’s not even 1 TB just yet, but since I got that DSLR camera my needs have been growing rapidly. It may be close to 1 TB at the moment, but by 2020 this number could easily be 10 TB, and with kids growing up and creating their own digital footprint, that 10 TB is just a wild guess. And what about 2030? Roughly 10 to 15 years ago 100 GB was quite a large drive, and now 4 TB is large. We don’t even bother about 100 GB anymore. So if that trend also holds for our data growth, the 10 TB I just mentioned might be too low!
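To get a feel for what these guesses imply, here is a small Python sketch that turns them into compound annual growth rates. The function names are mine, and the six-year horizon to 2020 is an assumption based on this post being written in late 2014; the capacities are the ones quoted above.

```python
def annual_growth_rate(start_tb, end_tb, years):
    """Compound annual growth rate that takes start_tb to end_tb in `years` years."""
    return (end_tb / start_tb) ** (1.0 / years) - 1.0


def project(start_tb, rate, years):
    """Capacity after `years` years of growth at a fixed annual `rate`."""
    return start_tb * (1.0 + rate) ** years


if __name__ == "__main__":
    # Large drives: 100 GB then, 4 TB now, over roughly 10 to 15 years.
    for years in (10, 15):
        rate = annual_growth_rate(0.1, 4.0, years)
        print(f"drive sizes over {years} years: {rate:.0%} per year")

    # My own data: about 1 TB now, a guessed 10 TB by 2020
    # (assumed to be 6 years away).
    my_rate = annual_growth_rate(1.0, 10.0, 6)
    print(f"1 TB -> 10 TB in 6 years: {my_rate:.0%} per year")

    # Extrapolating that same rate for another 10 years, to 2030.
    print(f"2030 at that rate: {project(10.0, my_rate, 10):.0f} TB")
```

Any fixed-rate extrapolation like this is only as good as the guesses that go into it, but it makes it easy to compare the 10 TB guess against the drive-size trend.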

Store everything?

But do we really want to store everything we create? Well, if the price is right, why not? Sorting out that data takes way too much time and I’m guessing that most people don’t use a smart catalogue to categorize their data anyway, so keeping everything is the easy way out.

Redundancy and the safety of your data

But do we really want to store our data on a single device? What if it breaks? What if it gets stolen? What if your house burns down? So when I store my data, it’s going to be on AT LEAST some form of RAID. RAID5 is cheaper than RAID1, but I’m concerned that the current technology and the growing capacities of these drives don’t give me enough security anymore. The usual estimate is that roughly one unrecoverable error occurs for every 100 TB you transfer, and for cheaper consumer drives the specified figure is often closer to one per 12.5 TB read. I will write another blog on the subject of MTBF (Mean Time Between Failures) and the safety of the current drive technology, so I’ll keep it simple in this blog post. What it comes down to is that we need some sort of redundancy to keep our data as safe as possible.

The best way would be some form of automated replication to an off-site location. This does sound expensive, but with the current generation of popular NAS devices, replication is a valid option! Simply buy another NAS that fulfills your archive storage needs and place it in your parents’ house, for example. This 2nd NAS doesn’t have to be as large as the primary one, since you probably won’t be storing your whole working set of data over there, but that depends on how safe you want your data to be. For me the most important data are photos and videos.
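To put a number on the unrecoverable-error worry mentioned above, here is a rough Python sketch. It assumes the error rate is given the way drive datasheets usually state it, as one unrecoverable read error per so many bits read; the two rates used (1 per 1e14 and 1 per 1e15 bits) are typical datasheet classes that I am assuming, not figures from a specific drive, and the function name is made up for this example.

```python
import math

TB = 1e12  # bytes per terabyte (decimal, as drive vendors count)


def p_unrecoverable_error(tb_read, bits_per_error):
    """Chance of hitting at least one unrecoverable error while reading tb_read TB.

    Treats every bit as an independent trial with error probability
    1 / bits_per_error, which is a simplification of how real drives fail.
    """
    bits = tb_read * TB * 8
    # (1 - 1/bits_per_error) ** bits, computed in log space for stability
    p_clean = math.exp(bits * math.log1p(-1.0 / bits_per_error))
    return 1.0 - p_clean


if __name__ == "__main__":
    # Assumed datasheet values: one error per 1e14 bits (consumer class)
    # and one per 1e15 bits (enterprise class).
    for bits_per_error in (1e14, 1e15):
        for tb_read in (4, 12, 100):
            p = p_unrecoverable_error(tb_read, bits_per_error)
            print(f"1 per {bits_per_error:.0e} bits, {tb_read:>3} TB read: {p:.1%}")
```

For a RAID5 rebuild, `tb_read` would be the combined capacity of the surviving drives, since all of them have to be read in full. That is why growing drive sizes make the rebuild riskier.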

So we just concluded that we need redundancy: at least two drives per location and two locations, which is already 4 drives at the minimum. That’s 4 x noise, 4 x heat, 4 x power. And even though a single drive may consume as little as 5 or 10 Watts, having 4 of these plus the NAS machines that you need to get to your data adds up.
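How much does it add up to? A quick sketch in Python, using the 5 to 10 W per drive from above. The 15 W per NAS machine is purely an assumption of mine for illustration, and so are the function names; plug in your own numbers.

```python
HOURS_PER_YEAR = 24 * 365

# Assumption for illustration only: idle draw of one small NAS, without drives.
ASSUMED_NAS_WATTS = 15


def setup_watts(drives, watts_per_drive, nas_units, watts_per_nas):
    """Continuous power draw of all drives plus all NAS machines, in Watts."""
    return drives * watts_per_drive + nas_units * watts_per_nas


def annual_kwh(watts):
    """Energy used in one year by a device that draws `watts` around the clock."""
    return watts * HOURS_PER_YEAR / 1000.0


if __name__ == "__main__":
    for watts_per_drive in (5, 10):
        total = setup_watts(4, watts_per_drive, 2, ASSUMED_NAS_WATTS)
        print(f"{watts_per_drive} W per drive: {total} W total, "
              f"{annual_kwh(total):.0f} kWh per year")
```

This assumes everything runs 24/7; drives that spin down when idle would use less.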

Slow but reliable

So what about solid state storage? It has no moving parts, it’s quiet, produces almost no heat at all and consumes very little power. But the current flash technologies are very fast and also very expensive. I’ve seen prices drop to about €0.60 per GB / $0.75 per GB, and since you need 4 devices that’s still €2400 / $3000 for a single TB of redundant usable capacity. That’s a bit too expensive for me. Considering IOps per $$, SSD storage is the way to go, but in terms of $$ per GB flash is the most expensive technology to store your data… for now.
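The arithmetic behind that price tag, as a small Python sketch. It assumes the layout described earlier: two sites, each holding a mirrored pair of devices, so every usable GB is bought four times. The function and parameter names are mine.

```python
def redundant_cost(usable_gb_per_site, price_per_gb, sites=2, copies_per_site=2):
    """Total purchase price when every site stores `copies_per_site` full copies.

    Assumes simple mirroring at each site, so the number of devices is
    sites * copies_per_site and each one holds the full usable capacity.
    """
    devices = sites * copies_per_site
    return devices * usable_gb_per_site * price_per_gb


if __name__ == "__main__":
    for currency, price_per_gb in (("EUR", 0.60), ("USD", 0.75)):
        total = redundant_cost(1000, price_per_gb)
        print(f"1 TB usable, 4 devices: {total:.0f} {currency}")
```

The NAS machines themselves are not included in this figure.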

What if flash was made a lot cheaper? And by that I mean a lot. An SSD does NOT need to be faster than a rotating disk; in fact, it could even be 100x slower than a rotating disk, since it’s only for archiving purposes! And what if slower means that the technology doesn’t need the tight electrical tolerances that today’s fast flash requires, and can be built in such a way that reliability increases by 100x or so? With reliability like that a device will actually last a lifetime! My lifetime, that is, and that’s what counts. Suppose this 10 IOps flash device costs 100x less per GB than flash costs nowadays? That would give us a super slow, but still online, storage device which would cost maybe $2 per 256 GB, or even less.

Some indexing intelligence might be a good idea, so you can access that old data directly instead of having to search through all those TBs of data in your NAS at home, but that’s just an implementation issue that could be solved by the vendor that brings you this archive monster. We don’t have to worry about keeping things simple and easy to find: just use this indexing engine, which resides on somewhat faster storage like a regular SSD, and we’re good to go 🙂
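A rough Python sketch of what such a device would mean in practice. It takes the $0.75 per GB quoted earlier as today’s price and reads the 100x price drop and the 10 IOps figure literally; the result lands in the same ballpark as the guess above. The 5 MB photo and the two I/O sizes are illustrative assumptions of mine, not measurements, and the function names are made up.

```python
import math


def future_price(price_per_gb_today, cheaper_factor, capacity_gb):
    """Price of capacity_gb if flash became cheaper_factor times cheaper per GB."""
    return price_per_gb_today / cheaper_factor * capacity_gb


def fetch_seconds(file_bytes, iops, io_bytes):
    """Time to read one file when the device completes `iops` I/Os per second
    and each I/O transfers at most `io_bytes` bytes."""
    ios = math.ceil(file_bytes / io_bytes)
    return ios / iops


if __name__ == "__main__":
    print(f"256 GB at 100x cheaper: ${future_price(0.75, 100, 256):.2f}")

    photo = 5_000_000  # assumed size of one photo, in bytes
    for io_bytes in (4096, 1024 * 1024):  # assumed small and large I/O sizes
        secs = fetch_seconds(photo, 10, io_bytes)
        print(f"5 MB photo at 10 IOps, {io_bytes} byte I/Os: {secs:.1f} s")
```

The point of the second function is that at 10 IOps the transfer size per I/O decides whether a photo takes a fraction of a second or minutes, which is one more reason to keep the index (and perhaps thumbnails) on the faster SSD.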

Storage of the future

These “(s)low-budget” drives may just be what we’ve been waiting for! We’ll be piggybacking on Moore’s law, and every 18 months we’ll get twice as much storage capacity in the same package. We don’t need the speed, we need space! Do we still need rotating disks? For now, yes. Rotating drives are not that expensive per GB and still provide acceptable performance, while current flash technology is moving from SSD-like devices to PCIe-like devices to provide us with lightning fast data access for our “hot data”. Last week I even read an article about “Memory Channel Interface (MCI) Storage”. Technology sure is moving in the right direction.
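If capacity per package really doubles every 18 months, the compounding is easy to sketch in Python. The 18-month doubling period is the assumption from the text; the function names and the 100x target are mine, chosen for illustration.

```python
import math


def growth_factor(years, doubling_years=1.5):
    """How many times larger capacity gets after `years` of steady doubling."""
    return 2.0 ** (years / doubling_years)


def years_to_factor(factor, doubling_years=1.5):
    """Years of steady doubling needed to grow capacity by `factor`."""
    return doubling_years * math.log2(factor)


if __name__ == "__main__":
    for years in (3, 5, 10):
        print(f"after {years:>2} years: {growth_factor(years):.0f}x the capacity")
    print(f"years needed for 100x: {years_to_factor(100):.1f}")
```

Whether flash actually keeps to that pace is a separate question; this only shows what the assumption implies.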

Imagine having a NAS box somewhere in your house with room for 32 of these 1 TB “(s)low-budget drives” in a casing that only measures about 1 x 1 x 1 ft. The box also houses the indexing engine on a (fast) SSD and shares the data over an industry standard protocol like NFS, CIFS or even FTP. Without moving parts and with an Atom-like CPU, the power supply would not need to provide dozens of Watts; maybe only 10 W is enough to operate this archive machine?

When?

I don’t think this technology will take long to appear on the market and I truly believe that many people will actually buy something like this if the price is right. My prediction? Within the next 10 years. For sure!!

  1. Early 2014 I bought a 240 GB Crucial M500 SSD for less than 100 euros / 125 dollars, and unfortunately prices haven’t really dropped since (it is now December 2014). 1 TB would be around 400 euros, so the fully redundant solution at 2 sites, with 1 TB usable per site, would still cost about 1600 euros. Plus a NAS machine at each site.
