The following are best practices when using the FileStore deduplication feature:
For file systems with block sizes of 4K and above, deduplication is most effective when the file system block size and the deduplication block size are the same. Matching block sizes also allow the deduplication process to estimate space savings more accurately.
The smaller the file system block size and the deduplication block size, the longer deduplication takes. Smaller block sizes, for example 1K and 2K, increase the number of data fingerprints that the deduplication database has to store.
Data archive and retention (DAR) file systems may be good candidates for deduplication, depending on the workload. FileStore deduplication is more effective if deduplication in Symantec Enterprise Vault (EV) is turned off before FileStore deduplication is run.
Changes in the file system are evaluated using the file system's File Change Log (FCL). Setting the deduplication frequency too low may cause the FCL to roll over, so that changes to the file system, and the deduplication opportunities they represent, are missed.
After enabling deduplication on file systems with existing data, the first deduplication run does a full deduplication. This can be time-consuming, and may take 12 to 15 hours per TB, so plan accordingly.
The deduplication database takes up 1% to 7% of the logical file system data. In addition, deduplication processing requires additional temporary storage space. Although 15% free space is enforced, 30% free space is recommended when the deduplication block size is less than 4096 bytes (4K).
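The sizing figures above can be turned into a rough planning calculation. The following Python sketch is illustrative only and is not part of FileStore: the function and field names are hypothetical, 1 TB is assumed to mean 1024^4 bytes, and the fingerprint count assumes at most one fingerprint per deduplication block, which is an upper bound rather than a documented figure.

```python
from dataclasses import dataclass

TB = 1024 ** 4  # assumption: binary terabytes


@dataclass
class DedupPlan:
    first_run_hours: tuple      # (low, high) estimate for the first full run
    database_bytes: tuple       # (low, high) deduplication database size
    free_space_fraction: float  # free space to keep available
    max_fingerprints: int       # upper bound, one fingerprint per block


def estimate_dedup_plan(logical_bytes: int, dedup_block_size: int) -> DedupPlan:
    """Rough planning figures derived from the best practices listed above."""
    if logical_bytes < 0 or dedup_block_size <= 0:
        raise ValueError("sizes must be positive")

    size_tb = logical_bytes / TB
    # First full deduplication run: 12 to 15 hours per TB.
    first_run_hours = (12 * size_tb, 15 * size_tb)
    # Deduplication database: 1% to 7% of logical file system data.
    database_bytes = (logical_bytes // 100, logical_bytes * 7 // 100)
    # 15% free space is enforced; 30% is recommended below 4K block size.
    free_space_fraction = 0.30 if dedup_block_size < 4096 else 0.15
    # Assumption: one fingerprint per deduplication block (ceiling division).
    max_fingerprints = -(-logical_bytes // dedup_block_size)

    return DedupPlan(first_run_hours, database_bytes,
                     free_space_fraction, max_fingerprints)


if __name__ == "__main__":
    for block_size in (1024, 4096):
        print(block_size, estimate_dedup_plan(2 * TB, block_size))
```

Under these assumptions, 2 TB of existing data gives a first-run estimate of 24 to 30 hours, and moving from a 4K to a 1K deduplication block size quadruples the upper bound on stored fingerprints.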