In my last blog post, I briefly mentioned that tape will prove the best way to manage the coming zettabyte apocalypse. Before the groaning starts, a few points need to be made.
Tape Plays Nice
First and foremost, tape is a technology that has been relegated to the history books many times, but it continues to enjoy fairly widespread adoption. Two of the three “industrial farmers” of the Cloud world, Google and Microsoft, admit to using tape in their storage infrastructure and to having big plans for tape storage going forward. (Amazon remains tight-lipped about its tape plans, though nearly every expert I speak with cannot see how the behemoth cloud service provider will be able to reach its goals without tape.) So, increasingly, when you talk cloud storage, you will likely be talking about tape.
Tape is Resilient
The problems that most anti-tapers cite when deriding the technology are mostly misguided. Contrary to the hyperbolic (and now recanted) statements of analysts in the late 1990s, tape is not more prone to failure than other storage media and is, in fact, among the most resilient. With a non-recoverable bit error rate that is an order of magnitude lower than that of SATA hard disks and on par with the best flash memory on the market today, tape is very reliable. And because of improvements in substrate materials and coatings, the durability of tape is about 30 years – much longer than flash, disk or optical.
Tape is Growing in Capacity
Third, tape is growing its capacity by leaps and bounds. Owing to Barium Ferrite (BaFe) coatings, which replace the metal particle tape coatings of the past and enable a variation of perpendicular magnetic recording (PMR) on tape media that rivals PMR on disk, tape has a long runway of capacity improvements ahead. The Linear Tape-Open (LTO) roadmap currently goes out to 120TB of compressed capacity per tape. This is actually a modest projection, since Fujifilm and IBM have demonstrated BaFe cartridges with 220TB of raw (uncompressed) capacity within the last year.
Tape is Enhanced with LTFS
Finally, tape technology has been further enhanced by the Linear Tape File System (LTFS) technology from IBM, which has been standardized by the Storage Networking Industry Association. LTFS provides a way to transparently bridge the file systems (and object storage systems) of the flash and disk world to tape, enabling files and objects to be stored to and retrieved from tape in much the same way as they are from a USB drive or disk drive. This usability improvement reduces the retraining that reintroducing tape into a central storage role might otherwise require. It also helps to eliminate the need for problematic backup software, which has long been the real source of the acid indigestion that operators blamed on tape technology to begin with.
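To make the "just like a USB drive" point concrete, here is a minimal sketch of what archiving and restoring a file might look like once an LTFS volume is mounted. It uses nothing but ordinary Python file operations; no tape-specific API is involved. The mount point, file names and helper functions are all hypothetical, and the demo substitutes a temporary directory for a real tape mount.

```python
import shutil
import tempfile
from pathlib import Path


def archive_file(source: Path, mount_point: Path) -> Path:
    """Copy a file onto a mounted volume with an ordinary file copy.

    mount_point is assumed to be wherever the LTFS volume is mounted;
    nothing in this function is specific to tape.
    """
    destination = mount_point / source.name
    shutil.copy2(source, destination)
    return destination


def restore_file(archived: Path, restore_dir: Path) -> Path:
    """Copy a file back from the mounted volume to local storage."""
    restore_dir.mkdir(parents=True, exist_ok=True)
    destination = restore_dir / archived.name
    shutil.copy2(archived, destination)
    return destination


if __name__ == "__main__":
    # Stand-in for a real mount: a temporary directory plays the role
    # of the (hypothetical) LTFS mount point.
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        mount_point = work / "ltfs_mount"
        mount_point.mkdir()

        report = work / "q3_report.txt"
        report.write_text("quarterly numbers")

        on_tape = archive_file(report, mount_point)
        restored = restore_file(on_tape, work / "restored")
        print(restored.read_text())
```

Because tape is a sequential medium, a real deployment would favor large, streaming copies over many small random reads, but the programming model stays the same as for any other mounted file system.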
Tape will be Required
So, if zettabytes are to be stored cost-effectively, tape will be required. This simple conclusion has been reached by leading cloud vendors and by a growing number of enterprises, especially those considering hybrid cloud architectures (“build your base, buy your burst” applied to storage and processing technology). Only tape can be manufactured in sufficient quantity to handle a 40 to 60 ZB spike in storage capacity demand by 2020.
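For a sense of scale, the back-of-the-envelope sketch below estimates how many cartridges a 40 to 60 ZB demand spike would require at the two per-cartridge capacities mentioned earlier: the 120TB compressed figure from the LTO roadmap and the 220TB raw figure from the Fujifilm and IBM demonstration. It assumes decimal units (1 ZB = 10^9 TB) and ignores redundancy, partially filled tapes and the fact that compressed and raw capacities are not directly comparable, so treat the output as rough orders of magnitude only.

```python
TB_PER_ZB = 10**9  # decimal units: 1 ZB = 10^21 bytes, 1 TB = 10^12 bytes


def cartridges_needed(demand_zb: int, capacity_tb: int) -> int:
    """Return the number of cartridges needed, rounded up to a whole cartridge."""
    demand_tb = demand_zb * TB_PER_ZB
    return -(-demand_tb // capacity_tb)  # ceiling division on integers


if __name__ == "__main__":
    capacities = (
        ("120 TB compressed (LTO roadmap)", 120),
        ("220 TB raw (demonstrated)", 220),
    )
    for demand_zb in (40, 60):
        for label, capacity_tb in capacities:
            count = cartridges_needed(demand_zb, capacity_tb)
            print(f"{demand_zb} ZB at {label}: {count:,} cartridges")
```

Even at the demonstrated 220TB density, the answer runs to hundreds of millions of cartridges, which is why manufacturing volume matters as much as capacity per unit.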
Looking to the Future
Of course, other tricks and techniques will be tried over the next four years to blunt the tsunami of data. We are already hearing about de-duplication in production data environments (as opposed to backups), and compression still has many adherents. And, of course, the best way to bend the cost curve in storage is to cull all of the unnecessary data from the storage you already have in order to free up space for the new bits. That will require a combination of data hygiene and archiving – which remain today the undiscovered country of data management. While there is little doubt that up to 70% of the currently deployed disk storage capacity could be reclaimed by sorting out the data junk drawer, this effort will not free up enough space to hold a billion terabytes of new data.
Bottom line: it is time to begin a sober consideration of data storage and to begin defining an “all of the above” strategy in hybrid storage infrastructure design. Otherwise, how can we ensure that all of our data will be securely managed and protected?