Organisations implement an email archiving solution such as Enterprise Vault for many reasons, and they have just as many ideas about what it will enable them to do. People consider space saving, legal discovery of items, end-user searching, offline access and many other aspects of the overall solution. In this article I will help you explore one of the components of an archiving solution - saving space. We will explore three ways that email archiving using Enterprise Vault can do this.
Shortcuts, Stubs and Placeholders
In a typical Enterprise Vault installation revolving around Exchange email archiving, and maybe some File Server archiving too, once an item has been archived it will be reduced to a stub, shortcut or placeholder (different names from different people for different aspects of archiving). The idea is that the stub is much smaller than the original item, literally a shortcut to that item. Enterprise Vault uses the information in that shortcut to locate the full item and return it to the client application.
You can easily imagine that mails that are 50-250 KB in size, or ones with 3-10 MB attachments, are quite common. These shrink down to about 12 KB. That's a huge space saving! The actual size that you end up with varies depending on what you want to include in the shortcut, and Enterprise Vault gives you a lot of flexibility in this area. With file system archiving it's better still because the placeholder is tiny: just a few kilobytes (and it isn't customisable).
This isn't going to affect every single email (or file) immediately. As with the policy of what to include in a shortcut, careful planning must be performed in order to work out the time-lines of what you'll archive. End-users might not accept archiving everything after 1 day!
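To get a feel for how the shortcut size and the archiving age interact, here is a rough back-of-the-envelope sketch. It is not an Enterprise Vault tool: the `Item` class, the `estimate_saving_kb` function and the sample mailbox are all made up for illustration, the 12 KB shortcut size is just the approximate figure quoted above, and treating items smaller than a shortcut as "no saving" is a simplification.

```python
from dataclasses import dataclass

SHORTCUT_KB = 12  # approximate shortcut size quoted above; varies with policy


@dataclass
class Item:
    size_kb: float  # size of the message in the mailbox
    age_days: int   # days since the message was received


def estimate_saving_kb(items, archive_after_days, shortcut_kb=SHORTCUT_KB):
    """Return the KB freed if items at or past the age threshold become shortcuts."""
    saved = 0.0
    for item in items:
        # Simplification: only items past the threshold are counted, and an
        # item already smaller than a shortcut contributes no saving.
        if item.age_days >= archive_after_days and item.size_kb > shortcut_kb:
            saved += item.size_kb - shortcut_kb
    return saved


# Hypothetical mailbox contents, purely for illustration.
mailbox = [
    Item(size_kb=50, age_days=10),
    Item(size_kb=250, age_days=120),
    Item(size_kb=5 * 1024, age_days=400),  # message with a 5 MB attachment
    Item(size_kb=8, age_days=700),         # already smaller than a shortcut
]

for threshold in (1, 30, 365):
    print(f"archive after {threshold} days: {estimate_saving_kb(mailbox, threshold)} KB saved")
```

Running the same calculation with a few different thresholds against real mailbox data shows quickly whether a longer, more end-user-friendly archiving age costs you much of the saving.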
There are several tools available for trying to visualise or calculate the effect of archiving on mailboxes and file servers. One which I've used before on mailboxes is Mailbox Analysis from QUADROtech.
There are also scripts available on the Symantec Connect Forums that do similar work.
Eliminating PST Files
A big selling point of many archiving solutions is that they can be used to 'finally' eliminate the proliferation of PST files in an organisation. PSTs are not particularly efficient at storing data, and people will regularly make several copies of the files themselves. This means that space is taken up on the user's workstation by the PST file, and it is likely that a copy (or two) ends up on at least one of your file servers as an end-user driven 'backup' of their data. Your backup schedules then have to back up that additional data too.
Enterprise Vault has several built-in mechanisms for helping collect and ingest PST files from across the organisation, and there are several third party solutions which are commonly used such as:
PSTAccelerator (from http://www.pstaccelerator.com/)
PST FlightDeck (from http://www.quadrotech-it.com/products/pst-flightdeck/)
The space saving gained from this aspect of Enterprise Vault archiving isn't for free though. The real data from those PST files does have to get stored within Enterprise Vault, and indexed too. When planning for the elimination of PST files don't make the mistake of calculating the total size of all the PSTs and thinking that is the total space you will save - it really won't be like that.
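A simple way to keep yourself honest here is to net off what Enterprise Vault will consume against what the PSTs (and their copies) occupy today. The sketch below does just that. The function name and all of the ratios in the example are hypothetical; the archive and index ratios in particular are placeholders that you would need to measure in your own environment.

```python
def net_pst_saving_gb(pst_total_gb, copies_per_pst, archive_ratio, index_ratio):
    """Estimate the net space freed by ingesting PSTs, under the stated assumptions.

    pst_total_gb   : combined size of the unique PST files
    copies_per_pst : average number of copies of each PST, counting the
                     original plus any end-user 'backups' on file servers
    archive_ratio  : archived size as a fraction of PST size (assumed value)
    index_ratio    : index size as a fraction of PST size (assumed value)
    """
    freed = pst_total_gb * copies_per_pst                    # space released once PSTs are removed
    consumed = pst_total_gb * (archive_ratio + index_ratio)  # new space used by the archive and index
    return freed - consumed


# Illustrative numbers only: 500 GB of PSTs, each existing twice on average,
# archived data at 60% of the original size and indexes at 10%.
print(net_pst_saving_gb(500, 2, 0.6, 0.1))
```

With only one copy of each PST the same figures give a much smaller net saving, which is exactly why the total PST size on its own is a misleading number.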
De-Duplication and Sharing
In my opinion Exchange 2010 should not have removed the technology around sharing of items. Single Instancing is a great benefit, and Enterprise Vault has actually advanced its abilities in this area over recent years. Items can be shared within a Vault Store, or within a Vault Store Group, or not shared at all. Sharing within a Vault Store Group means you can still have logical separation of data, which might make business processes flow better, while identical data held in those separate silos is stored only once.
For example you might create a Vault Store for your Marketing people, and another Vault Store for your HR people. With Enterprise Vault's OSIS (Optimized Single Instance Storage) technology, the data in those two Vault Stores can still be single instanced when they share a Vault Store Group, even if the resulting data is stored on WORM devices.
This sharing of data isn't just done at the email level though, it's one better than that. The email is deconstructed into chunks, such as the message body and each attachment. These elements can be shared by Enterprise Vault. For example if I send 3 separate emails to different people, with different text in the body of the message, but with the same PDF file attachment, Enterprise Vault can use its sharing technology to store only one copy of that PDF file.
Sharing of data between the data repositories also helps with your storage because, in the end, you are storing less data.
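The general idea of storing identical parts only once can be illustrated with a toy content-addressed store. To be clear, this is not how Enterprise Vault is implemented internally; the `SingleInstanceStore` class, its methods and the use of SHA-256 fingerprints are my own assumptions, chosen only to show the principle with the three-emails-one-PDF example.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: identical parts are kept only once."""

    def __init__(self):
        self._blobs = {}     # fingerprint -> bytes, each unique part stored once
        self._messages = {}  # message id -> ordered list of part fingerprints

    def _put_part(self, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        # setdefault keeps the first copy and ignores later duplicates
        self._blobs.setdefault(fingerprint, data)
        return fingerprint

    def archive(self, message_id, body, attachments):
        # The message is split into parts: the body, then each attachment.
        parts = [body, *attachments]
        self._messages[message_id] = [self._put_part(p) for p in parts]

    def retrieve(self, message_id):
        # Reassemble the message parts from their fingerprints.
        return [self._blobs[fp] for fp in self._messages[message_id]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


pdf = b"%PDF-1.4 " + b"x" * 1000  # stand-in for the shared PDF attachment

store = SingleInstanceStore()
store.archive("msg-1", b"Hi Alice", [pdf])
store.archive("msg-2", b"Hello Bob", [pdf])
store.archive("msg-3", b"Dear Carol", [pdf])

naive = 3 * len(pdf) + len(b"Hi Alice") + len(b"Hello Bob") + len(b"Dear Carol")
print("without sharing:", naive, "bytes")
print("with sharing:   ", store.stored_bytes(), "bytes")
```

The three message bodies differ, so each is stored separately, but the attachment has the same fingerprint every time and is therefore kept only once.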
How much of a saving each of the three areas will give you will vary from environment to environment. In some situations the needs of the business will dictate that shortcuts need to be the 'full message' and therefore in that situation there will be little space saving. What you should do today is work through the three highlighted areas and see whether or not you will benefit from space saving using Enterprise Vault.