Amazon Glacier, the archive cloud storage available in Amazon Web Services (AWS), is a great backup tape replacement, but too many IT professionals assume it is a true enterprise-grade cloud archive solution.
Veritas receives regular interest from IT professionals looking for alternatives to Amazon Glacier. They’re looking for something that offers the low cost and convenience of public cloud, and is also fully-managed, effortless to adopt, and meets their data governance and retrieval needs.
Before storing copious amounts of your precious data in the cloud, you need a clear picture of how retrieval will work.
With Amazon Glacier, retrieval is anything but simple. Here’s what Amazon says about downloading your data from Glacier:
Amazon Glacier provides a management console, which you can use to create and delete vaults. However, you cannot download archives from Amazon Glacier by using the management console. To download data, such as photos, videos, and other documents, you must either use the AWS CLI or write code to make requests, by using either the REST API directly or by using the AWS SDKs.
You also have to store your data in sets. When retrieving data, you have to request a retrieval job on a set and then – if you’re only interested in retrieving specific data within a set – you need to programmatically specify a retrieval range to avoid data transfer costs on the full set.
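To make the range step concrete, here is a minimal Python sketch of the arithmetic involved. It assumes Glacier's documented requirement that a retrieval range be megabyte-aligned: the start must be divisible by 1 MB, and the end must fall on a 1 MB boundary or at the end of the archive. The function name and its parameters are hypothetical and not part of any AWS SDK. It only builds the "start-end" range string you would then supply with a retrieval job request.

```python
MIB = 1024 * 1024  # Glacier's alignment unit: 1 MB = 1,048,576 bytes


def aligned_retrieval_range(offset: int, length: int, archive_size: int) -> str:
    """Return the smallest megabyte-aligned byte range covering the wanted data.

    offset/length locate the wanted bytes inside the archive (hypothetical
    inputs you would track yourself); archive_size is the archive's total size.
    """
    if offset < 0 or length <= 0 or offset + length > archive_size:
        raise ValueError("requested bytes fall outside the archive")
    # Round the start down to a megabyte boundary.
    start = (offset // MIB) * MIB
    # Round the exclusive end up to a megabyte boundary, capped at the archive end.
    end_exclusive = min((offset + length + MIB - 1) // MIB * MIB, archive_size)
    # The range string uses an inclusive end byte.
    return f"{start}-{end_exclusive - 1}"


# A 2,000,000-byte file stored 1,500,000 bytes into a 10 MB archive.
print(aligned_retrieval_range(1_500_000, 2_000_000, 10 * MIB))
```

Even in this small case you retrieve (and pay transfer on) 3 MB to get about 2 MB of data, and you have to keep your own record of where each file sits inside the set.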
There’s more complexity:
Most Amazon Glacier [retrieval] jobs take about four hours to complete. Amazon Glacier must complete a job before you can get its output. A job will not expire for at least 24 hours after completion, which means you can download the output within the 24-hour period after the job is completed.
Glacier’s complex and slow retrieval makes it difficult to see how it could be anything more than a tape backup replacement.
Veritas NetBackup SaaS Protection makes it easy to get your data out of the cloud.
It includes a self-service user access portal which you can provide to an unlimited number of your users (with instant access to browse, download, search, and even share cloud-archived content).
For bulk extraction and recovery scenarios, NetBackup SaaS Protection also includes a downloadable export utility for privileged users. The export utility provides a simple interface allowing an authenticated and authorized user to recover data on-demand, either downloading the folders and content to a new folder or merging things back to their original place. It’s handy for bulk downloading certain folders, entire folder structures, and discovery cases you might have created in the NetBackup SaaS Protection Admin Portal.
Unlike Glacier, Veritas doesn’t place hurdles between you and your data. There’s no arbitrary process to retrieve your data. You can optionally provide knowledge workers with self-service access to their cloud archives, and administrators have an intuitive user interface for bulk downloading specific content on-demand. Retrieval from NetBackup SaaS Protection always happens immediately.
Archiving usually isn’t something businesses want to do – it’s something they have to do. It’s also a complex undertaking. Amazon’s cloud services provide a platform for archiving, but not an actual solution, as we see from Amazon’s positioning of its archiving solution:
Amazon Web Services offers a complete set of cloud storage services for archiving. You can choose Amazon Glacier for affordable, non-time-sensitive cloud storage, or Amazon Simple Storage Service (S3) for faster storage, depending on your needs. With AWS Storage Gateway and our solution provider ecosystem, you can build a comprehensive, storage solution. (Emphasis added)
To achieve enterprise-grade cloud archiving with Amazon solutions, you need to deal with Glacier’s complex retrieval paradigm. Your solution might involve deployment and configuration of the AWS Storage Gateway. You might need partner products such as StorReduce if things like deduplication are important to you. If you want secure access to the data, you’ll need to go deeper in the Amazon cloud stack with AWS Identity and Access Management (IAM), with custom development if you want synchronized authorization. If you need compliance storage, you need to dig into Amazon Glacier Vault Lock. And if you need to search your data in the cloud, wrap your mind around Amazon CloudSearch.
If you’re busy with other projects and just want cloud archiving to be simple, Amazon Glacier isn’t it. Leveraging Amazon’s cloud storage services for archiving in a meaningful way for your business requires expertise, time, and careful planning. And even if you do manage to connect all the dots, perhaps involving AWS partner solutions, you’re likely still missing key features to support your enterprise archive use cases.
The Veritas Approach: Fully-Managed Cloud Archive
Our approach connects all the dots for you. The complete solution environment, consisting of infrastructure and software, is fully-managed for you, whether you deploy into your own Microsoft Azure account or under Veritas’ account.
How does your data get to the cloud?
Both the AWS Storage Gateway and Veritas’ virtual cloud gateway run on a simple virtual machine.
The AWS Storage Gateway presents a drive on your network which locally caches active data while floating everything else up to the cloud. Each instance of the AWS Storage Gateway that you run has a monthly fee along with its own storage and data transfer costs.
Here’s the rub: Not only does Amazon’s gateway add infrastructure for you to manage with its requirement to carve out on-premises storage capacity, but to use the gateway’s storage volume you also have a data migration problem to solve. It’s less than ideal for IT managers who want to simplify their lives and avoid the brouhaha of a never-ending data migration.
Unlike Amazon’s gateway, NetBackup SaaS Protection’s gateway is included in the cloud archive subscription at no extra cost. And instead of being a storage mount point, Veritas’ gateway extends your existing storage investments to the cloud with automated data migration policies.
The Veritas Approach: The Virtual Cloud Gateway
Our virtual cloud gateway is an engine that runs archiving and data protection policies against your existing storage investments and – based on rules you set – synchronizes and migrates targeted data to the cloud. We do this with permissions and folders intact so that your data maintains the same security and structure in the cloud.
In our gateway, you define the data that syncs/moves to the cloud using rules based on file type, last accessed, size, etc. Your rules run automatically based on a schedule, and you can control their impact on your network during peak and off-peak hours with bandwidth throttling.
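As an illustration only, a rule of this kind can be modelled as a filter over file metadata. The Python sketch below is not Veritas’ implementation or API. The class names, fields, and the example rule (PDFs of at least 25 MB that have sat idle for a year) are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable, List, Optional

MB = 1024 * 1024


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical metadata record for one file on existing storage."""
    path: str
    size_bytes: int
    last_accessed: datetime


@dataclass(frozen=True)
class ArchiveRule:
    """Hypothetical rule: every condition that is set must hold."""
    extensions: Optional[FrozenSet[str]] = None  # None means any file type
    min_size_bytes: int = 0
    min_idle: timedelta = timedelta(0)

    def matches(self, info: FileInfo, now: datetime) -> bool:
        ext = os.path.splitext(info.path)[1].lower()
        if self.extensions is not None and ext not in self.extensions:
            return False
        if info.size_bytes < self.min_size_bytes:
            return False
        # Idle time is measured from the last access to the evaluation time.
        return now - info.last_accessed >= self.min_idle


def select_for_archive(files: Iterable[FileInfo], rule: ArchiveRule,
                       now: datetime) -> List[str]:
    """Return the paths of the files the rule would send to the cloud."""
    return [f.path for f in files if rule.matches(f, now)]


now = datetime(2024, 1, 1)
rule = ArchiveRule(frozenset({".pdf"}), min_size_bytes=25 * MB,
                   min_idle=timedelta(days=365))
files = [
    FileInfo("/share/reports/q1.pdf", 40 * MB, datetime(2022, 3, 1)),    # large, idle PDF
    FileInfo("/share/reports/q4.pdf", 40 * MB, datetime(2023, 12, 1)),   # opened recently
    FileInfo("/share/media/intro.mp4", 900 * MB, datetime(2021, 1, 1)),  # not a PDF
]
print(select_for_archive(files, rule, now))  # only q1.pdf satisfies every condition
```

Keeping each condition optional means one rule type can express "everything older than 90 days" as easily as a narrow, type-specific policy.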
If you have multiple facilities with storage, you can run an instance of the virtual gateway in each location to locally sync and migrate your distributed storage environment. NetBackup SaaS Protection scales out across Azure regions if you need to maintain data sovereignty for certain locations. And when you archive to a common target in our cloud archive, you get deduplication across all the storage serviced by your gateways.
NetBackup SaaS Protection’s cloud gateway approach is easier because it handles the data migration problem for you, without adding infrastructure. NetBackup SaaS Protection essentially cloud-enables your existing storage infrastructure as opposed to showing up as more storage for you to manage.
Reducing storage on-premises isn’t something Glacier can do for you in any automated fashion.
In NetBackup SaaS Protection’s virtual cloud gateway, optional policies let you delete or create stubs out of the originals after archiving to the cloud. This frees up space in your storage arrays, allowing you to simplify backup and defer spending on new storage.
By using these stubs – which are essentially just pointers to a file’s new location in the cloud – you can move a significant amount of data from your on-premises file servers to the cloud without your end-users being affected or even aware of it. All the folders they’re used to browsing will still appear, and your users can still open files from those folders.
Let’s imagine that you have 40 TB of files across all your file servers. Taking a deeper look, you realize that 75% of the files haven’t been accessed at all in the past 90 days. (Many studies say it’s typically ~90% rather than 75%, but let’s look at it conservatively.)
This means that your file servers are holding 30 TB worth of data that no one is actually using. Cloud storage is less expensive per GB than on-premises storage, so if you had NetBackup SaaS Protection migrate those unused files to a cloud archive, leaving stubs in their place, you’d free up 30 TB of storage space. This would let you delay upgrading your on-premises servers to add more storage capacity, saving you money – all without preventing your users from accessing those files if needed.
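The arithmetic behind that estimate is simple enough to rerun with your own numbers. The short Python sketch below uses a hypothetical helper name and ignores the (small) space the stubs themselves occupy. It compares the conservative 75% figure with the ~90% figure mentioned above.

```python
def reclaimable_tb(total_tb: float, inactive_fraction: float) -> float:
    """Capacity (in TB) that could move to the archive, leaving stubs behind."""
    if not 0.0 <= inactive_fraction <= 1.0:
        raise ValueError("inactive_fraction must be between 0 and 1")
    return total_tb * inactive_fraction


total_tb = 40.0
for fraction in (0.75, 0.90):
    freed = reclaimable_tb(total_tb, fraction)
    print(f"{fraction:.0%} inactive: {freed:.0f} TB to archive, "
          f"{total_tb - freed:.0f} TB stays on-premises")
```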
In my opinion, if you’re archiving enterprise data to the cloud, you need to think ahead about cloud data management. Cloud storage shouldn’t be a dangerous black hole. Like black hole Gargantua in the movie ‘Interstellar’, you don’t want to find yourself floating about like spaceman Matthew McConaughey in the Tesseract trying to find a way to deal with your archive’s data gravity.
Amazon Glacier is not data-aware storage in the cloud. It’s just low-cost storage. Glacier has no built-in data governance or analytics features. In my opinion, this alone places Glacier in the category of tape-replacement backup solution, not an enterprise archiving solution. How you’ll one day manage data in Glacier is left for you to sort out.
NetBackup SaaS Protection’s Approach: Data Awareness with Integrated Data Governance Applications
NetBackup SaaS Protection’s data-aware cloud storage is an advantage for everyday IT operations, security, compliance, and litigation scenarios.
(What you’re about to read is ahead-of-the-curve functionality in cloud storage…)
Veritas’ scale-out cloud storage fabric runs a storage analytics and activity intelligence engine. We call it ‘data-aware storage’ for short. It works quietly behind the scenes to maintain insights and surface perspectives for admin users that make audits, investigations, data management, and monitoring hassle-free.
Data-aware storage shows you what’s in your archive, how your policies are doing, what data is sensitive or on legal hold, and how users are interacting with content. You can build and track your own queries to easily generate the insights you need.
Each of the following use cases can be stimulated and optimized by data-aware insights:
The policy engine in NetBackup SaaS Protection that links its query-ready cloud storage fabric with its data management controls runs in near real-time – there’s nothing you need to tinker with or learn administratively to realize this value.
Cloud storage, like a traditional file server, can be like a black hole for your data. Businesses need insight into what they are storing. Managing retention – actually deleting things in a defensible manner – is a concern for decision-makers. Your data in the cloud is not immune to needing legal holds, tagging, and auditing. If your data rests in a black hole, you’re sitting on a powder keg of risk.
With most archiving there is often the desire to have self-service user access. This can involve different approaches:
Amazon Glacier doesn’t handle self-service user access. You need to develop your own solution or add third-party products to the mix. This will involve becoming familiar with AWS Identity and Access Management (IAM), plus finding/developing a way to handle the authorization problem.
NetBackup SaaS Protection’s Approach: Self-service Web Access and Optional Policy-based Stubbing
Recall our earlier discussion of how the NetBackup SaaS Protection gateway captures the permissions when writing data or syncing changes to the cloud. When accessing their content in the cloud archive, Veritas ensures users see only their content according to the synchronized permissions on folders and items. Access rights just transparently remain in sync.
Inactive data sitting in user file shares and home directories is a great example of data you might want to stub. NetBackup SaaS Protection’s virtual cloud gateway includes an agentless stubbing approach that is controllable with policies. For example, you might create stubs out of PDF files that are larger than 25 MB and haven’t been accessed in over one year. When a user clicks the shortcut, they see that the data has been cloud archived and they have the option to instantly retrieve it or view it in the Web portal.
Businesses with compliance or litigation activity need search capabilities for the discoverability of their data. Search is also a great feature inside of self-service user access.
Amazon CloudSearch is a search service in the AWS cloud. Both Amazon CloudSearch and NetBackup SaaS Protection’s search-as-a-service are fully-managed for you in the cloud, including setup, maintenance, configuration, monitoring, and management.
But unlike Veritas’ search, Amazon CloudSearch requires you to roll up your sleeves to make meaningful use of it. For example:
Remember that the AWS Storage Gateway doesn’t synchronize folders and permissions when it sends data to the cloud, so you won’t natively be able to scope searches by folders, users, groups, or data owners if you use CloudSearch to index your data in Amazon S3 or Glacier.
NetBackup SaaS Protection’s Approach: Customizable Scoped Indexing
Veritas provides an on-demand search service designed to mitigate search costs and accelerate discovery methods. It is fully-integrated to support self-service user access and discovery scenarios.
Because full content indexing can be a compute and storage intensive operation, Veritas uniquely enables its customers to specify rules that control the indexing scope within their archive.
For instance, an indexing policy might specify that only certain users’ data needs to be keyword searchable. Instead of keyword indexing the entire corpus of data in a large archive, Veritas lets you specify the targeted portions of your archive that need to be keyword searchable. At any time, you can modify this scope as needed.
NetBackup SaaS Protection’s search service is enterprise-friendly with the following features:
This is by no means a knock against Amazon’s solutions. I am just calling it like it is. Amazon Glacier is a platform cloud service that is just one piece of the enterprise archiving puzzle.
Infrastructure and Platform-as-a-Service (IaaS and PaaS) – whether Amazon Web Services or Microsoft Azure – are awesome cloud computing solutions, but true enterprise-ready cloud archiving requires a Software-as-a-Service (SaaS) solution higher in the stack to make everything work hassle-free the way you expect.