Mauro_Gabriel_B
Level 3
Certified

I wanted to share some data I collected today while working with IT Solutions on deciding the best approach for continuing to send our backups offsite. We run about 12,000 jobs per weekend; around 3,000 of those go offsite on tape, and the rest are sent to DataDomain.

 

Environment details:
Master Server: HPUX 11.31 running NBU 7.0.1
Media Server: RHEL 5.7 running NBU 7.0.1
LTO4 tape drives in an SL8500, ACSLS-managed library
DataDomain DD990
Test SLP duplicating from DataDomain to LTO4


After reading several pieces (*) from people facing the same questions, I ran into a Backup Exec thread that discusses rehydration of the images before they are sent to tape, so I wanted to check the numbers myself.

 
Logs and data showing how deduplicated images are rehydrated in NetBackup when duplicating to tape:
=============================================================================
 
The deduplicated image I took as reference is the following:
Image = V6.5 I client1-nb_1360252262 SLPtest 3 
 
The 1st copy goes to @aaaak, the DataDomain media ID for the LSU sec_dd01_lsu.
The 2nd copy goes to LG1870, a physical LTO4 tape in the ACSLS-managed SL8500.
 
  • Found in the /usr/openv/netbackup/db/images catalog - DataDomain 1st copy, fragsize = 23833011 Kbytes = 22.7 GB
# FRAG: c# f# K rem mt den fn id/path host bs off md dwo f_flags f_unused1 exp mpx rl chkpt rsm_nbr seq_no media_subtype keep_date copy_date i_unused1
FRAGMENT 1 1 23833011 0 0 0 0 @aaaak nbumedia2-nb 524288 0 0 -1 0 1;DataDomain;datadomain01;sec_ost_datadomain01_pool1;sec_dd01_lsu;0 1361461862 0 65537 0 0 0 6 1361461862 1360253360 0 
 
  • Found in NBDB EMM_ImageFragment - DataDomain 1st copy, fragsize = 24405003264 bytes = 22.7 GB (both values are cross-checked in the sketch below)
'1000002','client1-nb_1360252262','1','1','0','@aaaak','1000184','0','0','6','1','24405003264','1', '@aaaak','2013-02-07 16:09:21.488512','2013-02-07 16:09:21.488521' 
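As a quick sanity check that the image catalog and EMM describe the same copy, here is a minimal Python sketch using only the two values pasted above (the catalog reports the fragment in Kbytes, EMM in bytes):

```python
# Cross-check copy 1's fragment size: image catalog (Kbytes) vs EMM_ImageFragment (bytes).
catalog_kbytes = 23_833_011      # FRAGMENT 1 1 23833011 ... (image catalog, copy 1)
emm_bytes = 24_405_003_264       # EMM_ImageFragment row for the same fragment

assert catalog_kbytes * 1024 == emm_bytes                       # exactly the same size
print(f"copy 1 logical size: {emm_bytes / 1024**3:.1f} GiB")    # ~22.7 GiB
```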
 
 
  • Found in DataDomain ddfs.info - fragment file, before dedup: 24405003264 bytes = 22.7 GB, after dedup: 220603943 bytes = 0.2 GB (the reduction factors are recomputed below)
02/07 05:47:08.431 (tid 0x28945d0): nfsproc3_ddcp_close_file_3_svc: ddcp ctx 25: closed file /data/col1/sec_dd01_lsu/client1-nb_1360252262_C1_F1:1360252262:client1_app:4:1::: 24405003264 bytes,  556668312 post-identity (43.84x), 220603943 post-lc (2.52x), 220603943 total (110.63x), 21.48 MiB/s (180.21 Mbps) logical, 0.19 MiB/s (1.63 Mbps) physical 
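To make that ddfs.info line easier to read, here is a small sketch that recomputes the reduction factors from the three byte counts it reports (all numbers are taken from the log line above):

```python
# Reduction factors for copy 1 as seen on the DataDomain (ddfs.info close_file line).
logical_bytes   = 24_405_003_264   # what NetBackup wrote (pre-dedup)
post_identity   = 556_668_312      # after identity/dedup filtering
post_local_comp = 220_603_943      # after local compression, i.e. bytes stored on disk

print(f"dedup:      {logical_bytes / post_identity:.2f}x")      # ~43.84x
print(f"local comp: {post_identity / post_local_comp:.2f}x")    # ~2.52x
print(f"total:      {logical_bytes / post_local_comp:.2f}x")    # ~110.63x
print(f"on disk:    {post_local_comp / 1024**3:.2f} GiB")       # ~0.21 GiB
```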
 
 
 
=> So what remains is to check the size of the copy duplicated to tape, to see whether it was rehydrated from 0.2 GB back to 22.7 GB:
  • Found in the /usr/openv/netbackup/db/images catalog - tape duplicated copy - 10240000 + 10240000 + 3353011 KB = 22.7 GB (summed in the sketch after the FRAGMENT lines)
# FRAG: c# f# K rem mt den fn id/path host bs off md dwo f_flags f_unused1 exp mpx rl chkpt rsm_nbr seq_no media_subtype keep_date copy_date i_unused1 
FRAGMENT 2 1 10240000 0 2 20 3 LG1870 nbumedia2-nb 262144 79598 1360253344 0 0 *NULL* 1363881062 0 65538 0 0 0 1 1363881062 1360254067 0 
FRAGMENT 2 2 10240000 0 2 20 4 LG1870 nbumedia2-nb 262144 119600 1360253344 0 0 *NULL* 0 0 0 0 0 0 1 0 0 0 
FRAGMENT 2 3 3353011 0 2 20 5 LG1870 nbumedia2-nb 262144 159602 1360253344 0 0 *NULL* 0 0 0 0 0 0 1 0 0 0 
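Summing the three tape fragments shows the duplicated copy is back to the full logical size, identical to copy 1 (fragment sizes taken from the FRAGMENT lines above):

```python
# Copy 2 (tape) fragment sizes in Kbytes, from the three FRAGMENT lines above.
tape_fragments_kb = [10_240_000, 10_240_000, 3_353_011]
copy1_kb = 23_833_011                       # DataDomain copy 1 size from the catalog

assert sum(tape_fragments_kb) == copy1_kb   # tape copy == full logical size again
print(f"tape copy: {sum(tape_fragments_kb) * 1024 / 1024**3:.1f} GiB -> fully rehydrated")
```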
 
 
 
To verify a bit more, here are both copies in the bptm log (throughput is summarized in the sketch after the log lines):
  • DataDomain Copy:
10:51:14.126 [11456] <4> write_backup: begin writing backup id client1-nb_1360252262, copy 1, fragment 1, destination path sec_dd01_lsu
11:09:17.488 [11456] <4> write_backup: successfully wrote backup id client1-nb_1360252262, copy 1, fragment 1, 23833011 Kbytes  at 22003.325 Kbytes/sec
 
  • Tape Copy:
11:15:50.453 [12119] <4> write_backup: begin writing backup id client1-nb_1360252262, copy 2, fragment 1, to media id LG1870 on drive sl5gl0129 (index 0)
11:17:39.378 [12119] <4> write_backup: begin writing backup id client1-nb_1360252262, copy 2, fragment 2, to media id LG1870 on drive sl5gl0129 (index 0)
11:20:01.438 [12119] <4> write_backup: begin writing backup id client1-nb_1360252262, copy 2, fragment 3, to media id LG1870 on drive sl5gl0129 (index 0)
11:17:39.367 [12119] <4> write_backup: successfully wrote backup id client1-nb_1360252262, copy 2, fragment 1, 10240000 Kbytes at 94660.553 Kbytes/sec
11:20:01.340 [12119] <4> write_backup: successfully wrote backup id client1-nb_1360252262, copy 2, fragment 2, 10240000 Kbytes at 72926.162 Kbytes/sec
11:21:06.999 [12119] <4> write_backup: successfully wrote backup id client1-nb_1360252262, copy 2, fragment 3, 3353011 Kbytes at 63524.449 Kbytes/sec
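And a small parsing sketch that totals the sizes and rates from the "successfully wrote" lines above. The regex and the summarize() helper are just my own illustration and assume exactly the line format pasted here; real bptm logs may vary slightly:

```python
import re

# Total Kbytes written and average rate per copy, from bptm "successfully wrote" lines.
WROTE = re.compile(
    r"successfully wrote backup id \S+, copy (\d+), fragment \d+, "
    r"(\d+) Kbytes\s+at ([\d.]+) Kbytes/sec"
)

def summarize(bptm_lines):
    per_copy = {}                                   # copy number -> [total_kb, [rates]]
    for line in bptm_lines:
        m = WROTE.search(line)
        if m:
            copy, kb, rate = int(m.group(1)), int(m.group(2)), float(m.group(3))
            entry = per_copy.setdefault(copy, [0, []])
            entry[0] += kb
            entry[1].append(rate)
    for copy, (kb, rates) in sorted(per_copy.items()):
        print(f"copy {copy}: {kb} Kbytes total, avg {sum(rates) / len(rates):.0f} Kbytes/sec")

# For the lines above, both copies total 23,833,011 Kbytes: copy 1 written to
# DataDomain at ~22 MB/s, copy 2 rehydrated and written to LTO4 at ~63-95 MB/s.
```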
 
 
The logs confirm that deduplicated images are rehydrated when duplicated from DataDomain to tape using SLPs.
- There is no benefit from the space reduction gained through deduplication when duplicating images to tape.
- There is a CPU overhead from having to read and rehydrate the images from DataDomain.
 
 
(*) Related articles:
Comments
Ketank
Level 1
Partner Accredited

Hi,

 

If you want to send the deduplicated backup to tape and keep it in the same format, that is possible with Commvault.

 

It copies the deduplicated data to tape without expanding it back to its original size.

 

This can save a lot of money whenever you back up to tape.

 

Mauro_Gabriel_B
Level 3
Certified

Thank you for the information, Ketank.

A few questions come from it. The first one I'd think about is: where is the metadata needed to rehydrate the image stored? Is it sent to tape as well, or retained in Commvault?

The second one would be for the Commvault devs: how did they work around the problem of slow random-access reads from tape? (I thought it was possible to write deduplicated data to tape, but reading it back requires a two-stage process.)

Well, before assuming too much I'll definitely read about Commvault and how it works, because the space reduction, as you say, would be a major saving.

 

Thank you very much Ketank.

 
