How Retention values for Archived items from Enterprise Vault are applied, identified and enforced in a Centera Storage Device

Article: 100022765
Last Published: 2021-10-04
Product(s): Enterprise Vault

Problem

It is sometimes necessary to identify the Retention value applied to an item in order to understand the expiry process. This article describes how Retention Category values for items archived by Enterprise Vault (EV) are applied, identified and enforced on a Centera Storage Device using the EMC Governance Edition or Compliance Edition Plus models.

Solution

The information below shows how to identify individual archived items in Enterprise Vault (EV), check the assigned retention and confirm the retention on a C-Clip stored on EMC Centera.

EMC Centera is a compliance storage device. Depending on the model type, an item written to the device cannot be modified after it is initially stored.

There are 3 General Model types for EMC Centera:
  • Basic Edition
  • Governance Edition (GE)
  • Compliance Edition Plus (CE+)

Identification of Model Type:
Enable Enterprise Vault's Dtrace on the StorageOnlineOpns* process and restart the Enterprise Vault Storage Service. (See note below) 

Example:
CVaultStoreEMCCentera::SetPool -- Compliance Device mode:
  • Compliance Device mode: True 

The Centera Storage is a Governance Edition or Compliance Edition Plus model

  • Compliance Device mode: False

The Centera Storage is a Basic Edition model.

Note: The "CVaultStoreEMCCentera::SetPool" function may also be called by the StorageCrawler, StorageDelete or StorageFileWatch process, depending on the request being sent to the device.
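
As a rough aid, the check above can be scripted against a Dtrace log saved as text. The sketch below is not part of Enterprise Vault. It assumes the True/False value follows the "Compliance Device mode:" label as in the example, and the function name is hypothetical.

```python
import re

# Matches the label shown in the Dtrace example, followed by True or False.
_MODE_RE = re.compile(r"Compliance Device mode:\s*(True|False)", re.IGNORECASE)

def centera_model_family(dtrace_text):
    """Return the model family implied by the first 'Compliance Device mode' entry.

    Returns None when the text contains no such entry.
    """
    match = _MODE_RE.search(dtrace_text)
    if match is None:
        return None
    if match.group(1).lower() == "true":
        return "Governance Edition or Compliance Edition Plus"
    return "Basic Edition"

sample = "CVaultStoreEMCCentera::SetPool -- Compliance Device mode: True"
print(centera_model_family(sample))
```

The result only separates Basic Edition from the two compliance models; the trace entry does not distinguish Governance Edition from Compliance Edition Plus.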

The primary differences between the Basic Edition and Governance/Compliance Editions are the following:
  • Basic Edition leverages the Application based Retention periods exclusively to control deletion requests.
  • Compliance/Governance Edition has Storage based Retention as well as Application based Retention to control deletion requests.

Details on deletion requests performed against Centera:
  • Enterprise Vault will not perform Privileged Delete requests.
  • If Collections are not used, each delete request is performed against the EMC Centera Replica Nodes and then against the Primary Nodes individually.
  • Centera must return a successful delete response from both the Replica and Primary Nodes for EV to treat a delete as successful.
  • If Collections are used, a Clip and the items associated with it are not deleted until all items referenced within the Clip are marked as deleted.
 

Storage and SQL
 
To understand EMC Storage and Retention, it is necessary to understand how EV stores archived items. In EV 2007 and earlier, when EV uses NTFS for a Storage location, each archived item is stored as a *.DVS file. When Collections are enabled for NTFS, the EV Collections process (StorageFileWatch) scans for files in a selected age range and collects the DVS files into CAB files to lower the total number of files on storage. The purpose is to improve the performance of file system backups of the Storage device by decreasing the number of files that must be accessed. See the following article for further details on NTFS Collections: https://www.veritas.com/docs/000035212.
 
Note: EV 8.0 introduced a new storage paradigm for NTFS which separates the item into a DVS, DVSSP and DVSCC file.

An EMC Centera device uses a proprietary storage method. Files are 'presented' to the Storage API (provided by EMC) by the Application (Enterprise Vault), and the Storage API communicates with the Storage Device. Items stored on a Centera Device are referenced as 'C-Clips' for Content Addressing on the device. A C-Clip, commonly referred to as a 'Clip', is assigned to the data of the stored object. Centera separates the data object into individual Binary Large Objects (BLOBs) for Single Instance Storage, and these are individually referenced in the Clip. A request by the application to retrieve or access an archived item includes the application item reference (Saveset ID) and the Centera Clip, which is passed through the Storage API. The Storage API uses the Clip identifier to locate the requested data content. A 'BLOB read' request is then performed to collect the BLOB data assigned to the Clip and File allocation. This BLOB data is then converted and presented to the application in its native format.
 
Note: When storing multiple items that are assigned to different Retention Categories, EV storage separates the items by Retention Category, ensuring that all items in a Clip are assigned the same "retention.period" value.
 
Without Collections being enabled, the archived item and Centera Clip are a 1 to 1 reference (1 Clip = 1 Archived item*) and are presented directly to the EMC Storage API via the StorageArchive process.
 
Note: A single archived item may still be separated into multiple 'blobs', based on the nature and complexity of the item, for sharing purposes.
 
 
When Centera has Collections enabled, the process uses an NTFS 'staging area', a temporary location to which items are copied from the original source. The items are then presented in batches to the Centera Storage API by the StorageFileWatch process, and the API stores the group of archived items (Savesets) so that they are referenced in a single C-Clip.

Notes:
   a. This process is the same with EV 2007 and earlier as well as EV 8.0 and above.
   b. Items are placed into the 'staging' area by the StorageArchive process. The StorageFileWatch process then presents these items to the EMC Storage API.
   c. DVS Files are placed in the 'staging' area utilizing the entire Saveset identity, which includes the unique IdTransaction.
 
Identifying items stored, the associated retention categories and collection identities:
 
1. Saveset Table:
When items are stored in EV, each email is assigned a 'Saveset' identity.
 
2. Collections Table and SavesetStore Table:
When items are 'Collected', the Collection reference is stored in the EnterpriseVault Vault Store Database (DB) in the Collection Table as the RelativeFileName value. When Collections are not enabled for Centera, the Clip ID reference is stored in the SavesetStore Table as the StorageIdentifier value. See the following article (https://www.veritas.com/docs/000029802) on identifying the Clip ID associated with a Saveset ID.
 
3. RetentionCategoryIdentity value and RetentionCategoryEntry Table.
Each Saveset is assigned a Retention Category when stored. This can be observed in the Saveset Table as the RetentionCategoryIdentity value. This value refers to the RetentionCategoryEntry Table within the EnterpriseVaultDirectory DB (Ex. RetentionPeriodUnits of "3" equals "Years"), and directly matches the EnterpriseVaultDirectory.RetentionCategoryEntry.RetentionCategoryIdentity value.
 
Example: 
Suppose that, on opening the EnterpriseVaultDirectory DB and viewing the RetentionCategoryEntry table, the "Business" RetentionCategoryName has a RetentionCategoryIdentity of "2". The following SQL script then displays all Savesets in a given Vault Store DB that are assigned to the "Business" Retention Category:
 
USE <VaultStoreDBName>
SELECT * FROM Saveset
WHERE RetentionCategoryIdentity = '2'

("<VaultStoreDBName>" = The Vault Store Database name - do not include the < >)
 
Alternate Query:
The following SQL query will return a count list of all the Savesets, grouped by their Retention Category value:
 
SELECT EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryIdentity, EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryName, COUNT(IdTransaction) AS 'No. of Savesets'
FROM Saveset
INNER JOIN EnterpriseVaultDirectory.dbo.RetentionCategoryEntry
ON Saveset.RetentionCategoryIdentity = EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryIdentity
GROUP BY EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryIdentity, EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryName
ORDER BY EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryIdentity, EnterpriseVaultDirectory.dbo.RetentionCategoryEntry.RetentionCategoryName

There are 2 primary rules regarding Retention categories:
 
A. If a new Retention Category is created to take the place of an existing Default Retention Category, only items archived after the creation of the Category will be assigned to it, since pre-existing savesets are assigned to the previous RetentionCategoryIdentity.

B. If an existing entry in the RetentionCategoryEntry Table is modified (RetentionPeriod and RetentionPeriodUnits), or the existing Retention Category is modified within the Vault Admin Console, all Savesets assigned to that RetentionCategoryIdentity will be affected.
 
4. Centera CDF (Clip Descriptor File) xml.
When Centera stores items referenced by C-Clips, it also generates a 'Metabase' file for the Clip. This descriptor file references the 'retention.period' value and the SavesetId value(s) associated with the Clip. The CDF file can be requested from the Storage Vendor (EMC). Normally, in order to request the CDF, the Clip ID must be known and the file must be saved as XML. This file gives an overall view of how the Clip was stored.

The "retention.period" value is assigned to the Clip at the time of archival and cannot be modified after the item has been stored. The value is expressed in seconds, so it must be converted (Ex. a value of "80438400" is divided by 60 to get minutes, by 60 again to get hours, by 24 to get 931 days, and then by 365 to get approximately 2.5 years).
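
The conversion can be checked with a few lines of Python. This is only a worked example of the arithmetic above. The function name is hypothetical and a 365-day year is assumed, so leap days are ignored.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86,400 seconds in one day

def retention_seconds_to_years(seconds, days_per_year=365):
    """Convert a retention.period value in seconds to approximate years."""
    days = seconds / SECONDS_PER_DAY
    return days / days_per_year

value = 80438400  # example retention.period value from the text above
years = retention_seconds_to_years(value)
print(f"{value // SECONDS_PER_DAY} days, about {years:.2f} years")
# prints "931 days, about 2.55 years"
```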

Notes:
    a. If Centera is in Basic Mode, the Clip "retention.period" will not be enforced.
    b. If Centera is in Basic Mode, the Clip "retention.period" will be "0" when the Retention Tab setting is set to "Never" under the Vault Store Partition properties.
    c.  The "retention.period" value of the Clip will be populated with the EV Retention Period when the Retention Tab setting is set to "For all Centera models" under the Vault Store Partition properties.
    d. If the Retention Category has "Retain items forever" selected, the Clip "retention.period" will be "0", since 'forever' is not a numerical value.
 
How to identify the items and determine the associated retention values:
 
1. For Journaled items, the process in article (https://www.veritas.com/docs/000008476) cannot be used on its own, because it relies on Shortcut information to obtain the Saveset ID of the item. Instead, search the Journal Archive in question for the relevant date range (Ex. 01/01/2004 - 12/30/2004) using the Advanced Browser Search (https://<EVServername>/EnterpriseVault/Search.asp?Advanced=3).

     a. Select an item and view it within the context of Search, highlight the Internet Explorer (IE) Address and copy the address into a text editor.

     b. From the content of the address, locate the value "SavesetId=" and copy only that value to a different line to separate it from the address.

2. Once the SavesetId value is identified, either via Shortcut or Indexable content, continue through article (https://www.veritas.com/docs/000008476) and perform the following SQL query against the Vault Store DB:
SELECT * FROM SAVESET
WHERE IDTRANSACTION LIKE '%FIRST SEVEN CHARACTERS OF TRANSACTION ID%'
 
3. Record the RetentionCategoryIdentity.
4. Utilize the following SQL Query to Identify the RetentionCategoryName.
 
USE EnterpriseVaultDirectory
SELECT RetentionCategoryIdentity, RetentionCategoryName
FROM RetentionCategoryEntry
WHERE RetentionCategoryIdentity = 'VALUE'
 
Note: Alter VALUE to the numerical value of the RetentionCategoryIdentity recorded in step 3.

5. Open the Vault Admin Console and locate the Retention Category name to identify the Retention Value.
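
Step 1b above (separating the SavesetId value from the copied address) can also be done with a short script. The sketch below uses only the Python standard library. The example address, its page name, its other parameter and the ID itself are hypothetical placeholders, not values taken from a real Enterprise Vault server.

```python
from urllib.parse import urlsplit, parse_qs

def extract_saveset_id(address):
    """Return the SavesetId query parameter from a copied address, or None."""
    query = parse_qs(urlsplit(address).query)
    # Compare parameter names case-insensitively in case the casing differs.
    for name, values in query.items():
        if name.lower() == "savesetid":
            return values[0]
    return None

# Hypothetical address: host, page, VaultId and SavesetId are all placeholders.
example = ("https://evserver.example.com/EnterpriseVault/ViewMessage.asp"
           "?VaultId=PLACEHOLDER&SavesetId=EXAMPLESAVESETID0001")
print(extract_saveset_id(example))
```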
 
Example of a scenario where items fail to expire due to a Retention mismatch:
 
  • Factors known:
    • Item has archive date of January 2005.
    • The Retention is set in EV for 5 Years.
    • Expiry is set to Archive Date.
    • Current Date : January 2010

Based on the above values, when Storage Expiry runs, EV will attempt to delete this item. However, the delete fails.

Why was the item not expired?

1. Confirm if the Vault Store in question utilizes Collections

a. Open the Vault Admin Console (VAC)
b. Expand Vault Store Groups and expand the associated Vault Store Group
c. Select the Vault Store.
d. Right-click on the Partition value on right side and select Properties.
e. Select the Collections Tab.

2. Utilize article (https://www.veritas.com/docs/000029802) to locate the Clip ID associated with the Saveset.

3. Retrieve the CDF.xml for the Clip and Identify the retention.period.

4. In this example, the converted value of the retention.period in the CDF is 7 years.
 

Conclusion:
The above scenario involved a Journal Archive that currently showed a 5 year Retention, yet its items would not expire. The items in question were initially archived with a 7 year retention period, as identified in the CDF. It appears that the retention period for the Journal Archive was later changed from 7 years to 5 years.

This change is solely in SQL and did not alter items already stored using the initial 7 year category.  

Since the items were initially archived under a 7 year retention, Centera also assigned a 7 year retention to the physical Clip on Storage. These items would not expire until the original 7 year retention has expired.
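
The mismatch in this scenario can be illustrated with a small date calculation. This is a sketch only: it assumes the Clip retention is counted from the archive date, it approximates a year as 365 days, and the day of the month used for both dates is an assumed value. The function name is hypothetical.

```python
from datetime import datetime, timedelta

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # approximation; ignores leap days

def clip_retention_elapsed(archived_on, retention_period_seconds, now):
    """Return True when the retention period has elapsed by 'now'."""
    return now >= archived_on + timedelta(seconds=retention_period_seconds)

archived_on = datetime(2005, 1, 15)  # January 2005; day of month assumed
now = datetime(2010, 1, 15)          # January 2010; day of month assumed

ev_retention = 5 * SECONDS_PER_YEAR    # what the Retention Category now shows
clip_retention = 7 * SECONDS_PER_YEAR  # what the CDF reports for the Clip

print(clip_retention_elapsed(archived_on, ev_retention, now))    # True
print(clip_retention_elapsed(archived_on, clip_retention, now))  # False
```

EV considers the item eligible for expiry under the 5 year value, while the 7 year value stored with the Clip has not yet elapsed, which matches the failed delete described above.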

 

 
