Potential data loss with Universal Share

Article: 100050621
Last Published: 2021-11-01
Product(s): Appliances, NetBackup & Alta Data Protection

Severity:

Possible Data Loss

Description:

A race condition in Universal Share under heavy load may result in data loss. The issue affects only data ingested through Universal Share; it does not affect data ingested by NetBackup operations, including backup, opt-dup, and replication.

Versions Affected:

NetBackup 8.1 and 8.1.2

NetBackup 8.2

NetBackup 8.3 and 8.3.0.1 and 8.3.0.2

NetBackup 9.0 and 9.0.0.1

NetBackup 9.1 and 9.1.0.1

Cause:

Under some load conditions, a synchronization problem between metadata references and data leads to data loss.

Solution:

With 8.1, upgrade to 8.1.2 and apply MSDP EEB Bundle ET 3956103 Version 24 or higher.

With 8.2, apply MSDP EEB Bundle ET 3981133 Version 29 or higher.

With 8.3, upgrade to 8.3.0.1 and apply MSDP EEB Bundle ET 4013394 Version 28 or higher, or upgrade to 8.3.0.2 and apply MSDP EEB Bundle ET 4045322 Version 4 or higher.

With 9.0, apply MSDP EEB Bundle ET 4021634 Version 5 or higher.

With 9.0.0.1, apply MSDP EEB Bundle ET 4033971 Version 4 or higher.

With 9.1, apply MSDP EEB Bundle ET 4040970 Version 7 or higher.

With 9.1.0.1, apply MSDP EEB Bundle ET 4047040 Version 1 or higher.
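
The version-to-bundle list above can be captured in a small lookup, which may be convenient when checking several servers. This is only a sketch: `required_eeb` is a hypothetical helper name, not a NetBackup command, and the values are copied from the list in this article.

```shell
# Hypothetical helper (not part of NetBackup): prints the minimum MSDP EEB
# bundle listed in this article for a given NetBackup version string.
required_eeb() {
    case "$1" in
        8.1)     echo "upgrade to 8.1.2, then ET 3956103 Version 24" ;;
        8.1.2)   echo "ET 3956103 Version 24" ;;
        8.2)     echo "ET 3981133 Version 29" ;;
        8.3)     echo "upgrade to 8.3.0.1 or 8.3.0.2, then the bundle for that release" ;;
        8.3.0.1) echo "ET 4013394 Version 28" ;;
        8.3.0.2) echo "ET 4045322 Version 4" ;;
        9.0)     echo "ET 4021634 Version 5" ;;
        9.0.0.1) echo "ET 4033971 Version 4" ;;
        9.1)     echo "ET 4040970 Version 7" ;;
        9.1.0.1) echo "ET 4047040 Version 1" ;;
        # Any other version is outside the scope of this article.
        *)       echo "not listed in this article" >&2; return 1 ;;
    esac
}
```

For example, `required_eeb 9.0.0.1` prints the bundle for 9.0.0.1. Each entry is a minimum; a higher bundle version also satisfies it.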

After applying the EEB, use the following example to check whether a Universal Share has lost data, and remove the identified files to return the system to a healthy state:

Example:

Find the share ID from the vpfs share path, /mnt/vpfs_shares/<share dir>/<share id>, or from the old Universal Share path on an appliance (for shares created from the appliance web console), /shares/<share id>. Assume the system has the following share:

/mnt/vpfs_shares/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d
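
In both path forms the share ID is the last path component, so it can be taken from the path with `basename`. A minimal sketch, where `share_id_from_path` is a hypothetical helper name and not a NetBackup command:

```shell
# Hypothetical helper (not part of NetBackup): prints the last component of a
# share path, which is the share ID in both path layouts described above.
share_id_from_path() {
    basename "$1"
}
```

Called with the example path above, it prints d96b02c2-6763-5a01-8f8a-4e4ee59f920d.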

Run vpfsck for the share "d96b02c2-6763-5a01-8f8a-4e4ee59f920d", which potentially has a data issue. The "--t 12" option means vpfsck does not check files written within the last 12 hours, to avoid false positives:

# /usr/openv/pdde/vpfs/bin/vpfsck --share_id d96b02c2-6763-5a01-8f8a-4e4ee59f920d --verify_data --t 12

Checking files for d96b02c2-6763-5a01-8f8a-4e4ee59f920d

[ERROR]: Cannot find dcid: 2188 file: /mnt/vpfs/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d/0/000/000/017.tst

[ERROR]: File information, path: /0/000/000/017.tst ec: 7 [ mtim: 1623274223 size: 1073741824 fg: 1 ext: /msdp/vol2/meta_dir/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d/map/0/000/000/017.tst extl: local ]

[ERROR]: File size in header and extent doesn't match, file: /mnt/vpfs/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d/0/000/000/018.tst header size: 1073741824 extent size: 0

[ERROR]: File information, path: /0/000/000/018.tst ec: 6 [ mtim: 1623274225 size: 1073741824 fg: 1 ext: /msdp/vol1/meta_dir/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d/map/0/000/000/018.tst extl: local ]

Check files for d96b02c2-6763-5a01-8f8a-4e4ee59f920d done!

vpfsck found two files with issues. These two files need to be removed from this Universal Share:

# rm /mnt/vpfs/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d/0/000/000/017.tst
# rm /mnt/vpfs/d96b/d96b02c2-6763-5a01-8f8a-4e4ee59f920d/0/000/000/018.tst
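
When vpfsck reports many files, the affected paths can be collected from its output instead of being copied by hand. The sketch below assumes the output format matches the sample above, with the full path following "file: " on an [ERROR] line, and that paths contain no spaces. `list_flagged_files` is a hypothetical helper name, not a NetBackup command.

```shell
# Hypothetical helper (not part of NetBackup): reads vpfsck output on stdin
# and prints each distinct path that follows "file: " on an [ERROR] line.
# Assumes paths contain no spaces, as in the sample output in this article.
# The "File information, path: ..." lines are skipped because they carry a
# share-relative path after "path: " rather than a full path after "file: ".
list_flagged_files() {
    sed -n 's|^\[ERROR\]: .* file: \(/[^ ]*\).*$|\1|p' | sort -u
}
```

The function would be fed the output of the vpfsck command shown earlier through a pipe, assuming the errors are written to standard output; if they go to standard error, redirect it as well. Review the resulting list before removing any file.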

References:

Etrack: 3981133
