Is There a Glacier in the 3000's Future?
September 26, 2012
By Brian Edminster
Applied Technologies
I heard about Amazon's Glacier service a couple of weeks ago, and was interested in it enough to pull the technical documentation for it. The front page for the service describes it so:
A storage service that provides secure and durable storage for data archiving and backup. In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, a significant savings compared to on-premises solutions.
Why did I pull the technical docs? I wanted to see if I could build a 'client' for MPE/iX. The answer looks to be yes, but it's not quite that simple. Just because something can be done doesn't make it a viable solution, at least not for every situation.
Earlier this week InfoWorld posted an article asserting that cloud storage is the final nail in tape's coffin. The crux of the article is the new Glacier service, and how its pricing structure makes it possible to use the Cloud as your archive store rather than tape of one format or another. It would also eliminate the need to periodically refresh your backup media. Regardless of format, tape ages even when it isn't being read or written, so it must be read and transferred to new media periodically; or more correctly, it must be if you intend to rely on it as a backup.
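To put the quoted price in perspective, here is a minimal sketch of the storage arithmetic. It uses only the $0.01 per gigabyte per month figure from Amazon's description; the function names and the 500GB example are my own, and any retrieval or request charges are deliberately not modelled.

```python
# Storage-only cost estimate, using the "as little as" price quoted by Amazon.
# Assumption: a flat rate per GB per month; retrieval and request fees are ignored.
PRICE_PER_GB_MONTH = 0.01  # US dollars


def monthly_cost(archive_gb):
    """Dollars per month to keep archive_gb gigabytes stored."""
    return archive_gb * PRICE_PER_GB_MONTH


def annual_cost(archive_gb):
    """Dollars per year to keep archive_gb gigabytes stored."""
    return 12 * monthly_cost(archive_gb)


if __name__ == "__main__":
    size_gb = 500  # hypothetical archive size
    print(f"{size_gb} GB: ${monthly_cost(size_gb):.2f}/month, "
          f"${annual_cost(size_gb):.2f}/year")
```

The number that matters for a real comparison is what a year of tape media, drives, and offsite handling costs you for the same volume of data.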
In analyzing Amazon Glacier's anticipated usage patterns, it looks like they're intending it for partial or application backups or archives, rather than massive full-backup archives to be used for any large full system recovery.
There are some kinks to work out, though, that aren't talked about in the article. If a Glacier client were to be built for 'native hardware' MPE/iX instances, there are a number of things that need to be considered in the event of a system failure that requires a rebuild.
To begin with,
1) Is this a full system recovery, or just an application or two? Regardless, you'll need to have rebuilt your system to the point you can start retrieving the requisite backups over the network, and that will likely require a local CSLT and tape drive to match.
2) You need sufficient disk space to hold the retrieved STD (Store-to-Disk) backups, so they can be restored.
3) Can you afford the downtime to pull many gigabytes of backup over the network, then restore it, then do whatever post-restore work is necessary — before your system can go live again?
Number 3 is the real kicker here. The Glacier service requires approximately four hours from the time you request that your backup be made available until you can start retrieving it, whether it's an 'all in one piece' backup (a 'VM' disk image?) or 'chunked', as an MPE/iX Store-to-Disk backup becomes when it's storing more than 2GB of data. I've not seen any published figures that indicate whether this four hours is a worst case or not, but they make no guarantees of faster delivery.
Then there's the issue of transfer time from the Glacier service back to your 3000.
So tell me, just how fast is your inbound internet access? And even if you have fiber or broadband, how fast can your 3000 receive the data and put it away? The answer will probably depress you, because it certainly did me, unless you have a big N-Class system with multiple 100Mb NICs and really fast disks.
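A back-of-the-envelope calculation makes the point. The sketch below adds Glacier's roughly four-hour lead time to the time needed to pull a backup over the wire. Everything except the four hours is an assumption of mine: the function name, the 70 percent link efficiency, decimal gigabytes, and the sample sizes and speeds. It also ignores the restore and post-restore work that follows.

```python
# Rough elapsed time before a restore can even begin: lead time plus transfer.
# Assumptions: decimal gigabytes, link speed in megabits per second, and a
# guessed "efficiency" factor for protocol overhead and the 3000's own limits.
GLACIER_LEAD_HOURS = 4.0  # approximate figure discussed above


def hours_until_restorable(backup_gb, link_mbps, efficiency=0.7,
                           lead_hours=GLACIER_LEAD_HOURS):
    """Hours from the retrieval request until the whole backup is on local disk."""
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    bits_to_move = backup_gb * 8 * 1000**3
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    transfer_hours = bits_to_move / usable_bits_per_second / 3600
    return lead_hours + transfer_hours


if __name__ == "__main__":
    # Hypothetical cases: (backup size in GB, inbound link in Mbit/s)
    for gb, mbps in [(20, 10), (100, 10), (100, 100), (500, 100)]:
        hours = hours_until_restorable(gb, mbps)
        print(f"{gb:>4} GB over {mbps:>3} Mbit/s: about {hours:.1f} hours")
```

Plug in your own backup size and the throughput your 3000 can actually sustain, which may be well below the nominal speed of the link.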
The thing that's keeping me from burning the midnight oil on an MPE Glacier client isn't the four-hour 'lead time' that Glacier wants. That's probably going to be the smallest portion of the elapsed time in a recovery effort. In fact, the Glacier User's Guide specifically talks about receiving data to archive by having you ship them your disk or other (yes, even thumb) drives. They recognize that even their networks have bandwidth limits. And if it's a limitation going in, it'll be a limitation coming out.
In short, in order to have recovery that's acceptably quick, you need to build your system to be fault tolerant — so it doesn't fail irreparably (requiring a complete system restore from backup) in the first place. That was not so much a revelation as it was a cold slap in the face.
Large systems, 3000s included, have become large enough that complete conventional 'bare metal' restores aren't practical as recovery mechanisms anymore. That's something I'm outlining for another part of my backup article series (Backup and Recovery Best Practices, or, Don't fail hard in the first place.)
Can something like Glacier work for 'small to mid-sized' 3000s? I think the answer there is yes. But you'll have to find or make a Glacier client, along with the integration and backup/recovery planning to go with it.
The only 'gotcha' is that conventional wisdom says the majority of the remaining 3000 sites generally only come in two sizes now: large, and small. And the small ones are the hard ones to find — and even harder to sell 'new stuff for their 3000' to.
I think that the IT environment that the 3000 lives in might change the dynamics of the above. For example, is there local Network Attached Storage that can be a backup destination? But then Glacier wouldn't directly be part of a 3000 backup tool, it'd be part of a tool for backing up the NAS.