Harvesting Volumes with Precision in EVDC
When you add Primary Volumes in EVDC, it harvests the entire volume by default. In most cases this is fine, but sometimes you want to target specific folders on a file server or a specific mailbox or two in Exchange. There are a couple of ways to achieve this, but the best way in my opinion is to use Included directories in the Show advanced options section of the Volume definition page.
Once you expand Show advanced options, you will see Include directories. This field lets you use regular expression syntax to tell EVDC exactly what you would like to harvest. This is very handy if you are targeting an Exchange server with 2,500 mailboxes and you only need to index 10 of them.
Let's work with that example and walk through how to set it up.
First, log in to EVDC with an account that has admin permissions. Then follow these steps:
1. Select the data sources tab from the menu, then click Volumes in the specify volumes section.
2. Click Add primary volumes. This opens the Primary volume definition form.
3. Since we are adding Exchange, choose Exchange from the Server type drop-down list. Then fill in the appropriate details:
Server type: The type of server you will be targeting for collection.
Version: The version of the Exchange server you are targeting.
Server: The name of the Exchange server.
MailboxServer: Only needed if using a Client Access Server (CAS).
ActiveDirectoryServer: Only needed if you have to specify a Global Catalog Server (GC).
Connect as: Enter the name of an account with the appropriate Exchange permissions.
Volume: For Exchange this is a friendly name. You can enter any name, but whatever you enter will be the name of this volume in the EVDC interface.
Initial directory: If targeting a specific mailbox, enter the user's primary SMTP address. This is case sensitive. I don’t recommend using this because you cannot change it after you save the volume.
Virtual root: This auto-populates depending on which version of Exchange you choose. Leave the default.
Index options: Select your appropriate indexing level.
Validation: I always recommend leaving this checked. This allows EVDC to verify that the Connect as user has the appropriate permissions.
Included directories: This is where we enter the mailboxes we want to target. In my example I want Mike Smith's, Tom Jones's, and Phil More's mailboxes indexed, so I enter the first part of each primary SMTP address like this: ^mike.smith|^tom.jones|^phil.more
Constraints: This limits the number of connections EVDC will make to the data source. Typically you don’t need to constrain the Harvest task but you could use this whilst you test the load put on the data source during the harvest process.
4. Click OK to save your changes. The data source will now be available to add to a Harvest job, and only the mailboxes that match the regular expression will be harvested.
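If you want to sanity-check an expression before you save the volume, you can try it locally first. The sketch below uses Python's re module as a stand-in for EVDC's regular expression engine. It assumes EVDC applies the expression as an unanchored search against the mailbox name, and the mailbox addresses are made up for illustration, so treat it as a rough check rather than a guarantee of what EVDC will harvest.

```python
import re

# The expression entered in Included directories in the example above.
INCLUDE_PATTERN = re.compile(r"^mike.smith|^tom.jones|^phil.more")

# Hypothetical mailbox addresses, for illustration only.
mailboxes = [
    "mike.smith@example.com",
    "tom.jones@example.com",
    "phil.more@example.com",
    "amy.smith@example.com",
    "not.mike.smith@example.com",
]


def is_included(name: str) -> bool:
    """Return True if the expression matches somewhere in the name."""
    return INCLUDE_PATTERN.search(name) is not None


# Keep only the mailboxes the expression would select.
selected = [m for m in mailboxes if is_included(m)]
print(selected)
```

Note that an unescaped . matches any single character in standard regular expression syntax, so ^mike.smith would also match a name beginning with mike_smith. That is rarely a problem for mailbox names, but it is worth knowing.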
Now that we have set up the volume, let’s talk a bit more about the regular expression syntax. You will notice I used a caret ^ at the beginning of each name. Then I separated the names with the pipe | sign, which works as the ‘OR’ operator. Be sure you do not put a | at the end of your list: a trailing | adds an empty alternative that matches everything, so all directories will be harvested!
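The trailing pipe problem is easy to reproduce outside EVDC. This is a minimal sketch in Python, again assuming EVDC's engine treats an empty alternative the way standard regular expression engines do; the names are hypothetical.

```python
import re

# Hypothetical mailbox names, for illustration only.
names = ["mike.smith", "tom.jones", "amy.brown"]

good = re.compile(r"^mike.smith|^tom.jones")
# The trailing pipe adds an empty alternative, which matches any name.
bad = re.compile(r"^mike.smith|^tom.jones|")

matched_good = [n for n in names if good.search(n)]
matched_bad = [n for n in names if bad.search(n)]

print("without trailing pipe:", matched_good)
print("with trailing pipe:   ", matched_bad)
```

With the trailing pipe, amy.brown is selected along with everyone else, which is the "all directories" behavior described above.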
If you need to add a mailbox after you have added the volume, simply choose Edit from the primary volume list and add the mailbox to the end of the Included directories list.
As I mentioned earlier, this is a big advantage over using Initial directory: once that is set and you save the volume, you cannot edit it.
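If you maintain the list of mailboxes somewhere else, such as a text file or spreadsheet, you could generate the expression instead of typing it by hand. The helper below is a hypothetical convenience, not part of EVDC. It puts a ^ in front of each name, joins the names with |, and drops blank entries so the result never ends in a stray pipe.

```python
def build_include_pattern(prefixes):
    """Join mailbox name prefixes into one expression: ^a|^b|^c."""
    # Drop blank entries so no empty alternative or trailing pipe is produced.
    cleaned = [p.strip() for p in prefixes if p.strip()]
    return "|".join("^" + p for p in cleaned)


# The three mailboxes from the example above.
pattern = build_include_pattern(["mike.smith", "tom.jones", "phil.more"])
print(pattern)

# Adding a mailbox later; amy.brown is a made-up name, and the blank
# entry is ignored.
updated = build_include_pattern(
    ["mike.smith", "tom.jones", "phil.more", "amy.brown", ""]
)
print(updated)
```

You would then paste the printed expression into the Included directories field when you edit the volume.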
Another operator to be aware of is the $ sign. It indicates the end of the line, so if you were working with a file share and you put in ^User$ you would only get folders named User. If, however, you were to just enter User in the Included directories line you would get SiteUsers, User_Groups, and User. Basically, anything that has User in the name will be harvested.
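Here is the same comparison as a small Python sketch, using the folder names from the paragraph above plus one made-up folder (Admin) that should never match. As before, Python's re module stands in for EVDC's engine, so this shows the standard behavior of ^ and $ rather than anything specific to EVDC.

```python
import re

# Folder names from the example above, plus one unrelated folder.
folders = ["SiteUsers", "User_Groups", "User", "Admin"]

loose = re.compile(r"User")    # matches anywhere in the name
exact = re.compile(r"^User$")  # must match the whole name

loose_matches = [f for f in folders if loose.search(f)]
exact_matches = [f for f in folders if exact.search(f)]

print("User   ->", loose_matches)
print("^User$ ->", exact_matches)
```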
So to recap, the operators to use are:
^ = beginning of line
$ = end of line
| = OR
I hope this helps you understand how to harvest your volumes with a great degree of precision, using these operators to limit the scope of the harvest.