The dalloc (Delayed Allocation) feature of VxFS (Veritas File System) can cause silent data corruption
VxFS 6.x includes a new feature called 'delayed allocation,' also known as dalloc. This feature is enabled, by default, for locally-mounted file systems. It is not supported for cluster-mounted file systems.
Asynchronous, sequential writes, that extend the size of a file, will create dirty memory pages. New extents can be allocated when the dirty pages are flushed to disk, via background processing, rather than allocating the extents in the same context as the write I/O. The dalloc feature therefore delays the extent allocation until the dirty pages are flushed to disk. The dalloc feature allows VxFS to write to a file even when the allocation has not happened.
When the dalloc feature is enabled on VxFS, a write can be missed when part of the in-memory data fails to reach the disk because of an internal miscalculation. The resulting lost write can cause silent data corruption.
On Solaris, there are no visible symptoms to trace the corruption, unless there are backups of the files, which can be used for comparison.
On AIX and Linux, a file generated through asynchronous, sequential write operations appears to be intact if it is read immediately after it is created. However, if the file is read again after a significant time gap, part of the file is replaced with zeros.
This is because an asynchronous write is a buffered write. The written data is held in the kernel page cache, so reads are correct until those pages are reused. Once the cached pages are replaced with other data, the original data is lost, and the file has a hole in place of the lost data. Any application reading that range receives zeros.
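Because the loss only becomes visible after the cached pages are gone, one way to detect it without a backup is to record a checksum when the file is created and compare it later. The following is a minimal sketch, not a Veritas-provided tool: the function names and the separate checksum file are hypothetical, and the later check is only meaningful once the page cache has been reused (for example after a remount or reboot).

```shell
# Record a checksum of FILE ($1) in SUMFILE ($2) right after the file is written.
# At this point the read is served from the page cache, so it reflects
# what the application actually wrote.
record_sum() {
    cksum < "$1" > "$2"
}

# Succeed only if FILE ($1) still matches the checksum stored in SUMFILE ($2).
# Run this after the cached pages have been evicted.
verify_sum() {
    [ "$(cksum < "$1")" = "$(cat "$2")" ]
}

# Example use (paths are illustrative):
#   record_sum /avol01/datafile /var/tmp/datafile.cksum
#   ...later...
#   verify_sum /avol01/datafile /var/tmp/datafile.cksum || echo "content changed"
```

Keeping the checksum file on a different file system avoids depending on the file system that is being checked.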
This happens on Storage Foundation, with VxFS 184.108.40.206, 6.1, 6.2 and subsequent hot fixes, where dalloc is enabled.
Cluster File Systems (CFS) are not at risk.
- All platforms
- Local mounts only. Cluster File Systems (CFS) are not at risk.
- Veritas File System (VxFS)
- 6.1.1, 6.2
- 220.127.116.11 (recalled)
- subsequent hotfixes
How to determine if the issue is present
1. When the problem occurs, the extent map (space allocation map) of the file will show a hole in it. The extent map of the file can be checked with fsmap.
/opt/VRTS/bin/fsmap -a <file>
# /opt/VRTS/bin/fsmap -a /avol01/datafile
Volume  Extent Type  File Offset  Extent Size  File
....
avol01  Data         1310720000   8626176      /avol01/datafile
avol01  Data         1319346176   24928256     /avol01/datafile
-       <Hole>       1344274432   131072       /avol01/datafile
avol01  Data         1344405504   8626176      /avol01/datafile
avol01  Data         1353031680   24928256     /avol01/datafile
2. Using the file system debugger fsdb.
First, get the inode number of the file, using the command ls -li.
# ls -li /avol01/datafile
2214 -rw-r--r--. 1 root root 1377959936 Aug 23 16:18 /avol01/datafile
Check the extent map with fsdb.
# echo '2214i.mapall' | /opt/VRTS/bin/fsdb /dev/vx/rdsk/adg/avol01
offset      device  block  length  plength
....
1310720000  0       24344  8424    8424
1319346176  0       32768  24344   24344
1344274432  -       HOLE   128
1344405504  0       57112  8424    8424
1353031680  0       65536  24344   24344
When an application reads the part of the file that corresponds to the hole in the extent map, VxFS returns zeros for that part of the file. This is by design.
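The fsmap check above can be scripted when many files need to be examined. The sketch below assumes the column layout shown in the sample fsmap output (extent type in the second field, offset and size in the third and fourth); the function name is hypothetical. A hole is not by itself proof of corruption, since sparse files contain holes legitimately, so any hit still needs to be judged against how the file was written.

```shell
# Read `fsmap -a <file>` output on stdin and print "offset size" for each hole.
# Prints nothing when the extent map contains no holes.
find_holes() {
    awk '$2 == "<Hole>" { print $3, $4 }'
}

# Example use (not run here, requires VxFS):
#   /opt/VRTS/bin/fsmap -a /avol01/datafile | find_holes
```

In the sample output the reported hole is 131072 bytes, which matches the 128-block HOLE entry in the fsdb output if the file system block size is 1024 bytes (128 x 1024 = 131072).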
Due to a defect, some in-memory data is not flushed to disk when the dalloc feature is used.
The problem is fixed in the following patch and hotfix releases.
|Platform||Affected VxFS Versions||Required Public GA VxFS patches||Private Hot-fixes|
|6.1||18.104.22.1680 (to be released)||22.214.171.124|
|6.1||126.96.36.1990 (to be released)||188.8.131.52|
|Linux||184.108.40.206||220.127.116.110||18.104.22.168 (RHEL6 only)|
|6.1||22.214.171.1240 (to be released)||6.1.1.009|
|6.1||126.96.36.1990 (to be released)
6.2 (fix is provided in VxFS 6.2 on AIX)
The public patches can be downloaded from the Veritas Services and Operations Readiness Tools (SORT) website.
Note: The Veritas Storage Foundation High Availability (SFHA) patch version can be different from the Veritas File System (VRTSvxfs) version. Sometimes, within the SFHA patch set, the actual VxFS patch version is lower than the SFHA version. For example, in the SFHA patch version 188.8.131.520, for the RHEL 6.7 platform, the VxFS patch version is actually 184.108.40.206. If an SFHA patch set is installed, check the actual version of the included VxFS patch.
Warning: Disabling the dalloc feature may impact the performance of asynchronous, buffered writes that extend the size of a file. Extent allocation will be done in the same context as the write I/O (pre-VxFS 6.0 behavior) rather than being delayed until the dirty pages are flushed to disk.
- If a VxFS file system is resized or remounted (using mount -o remount) after dalloc is disabled, the resize or remount operation may re-enable dalloc. The dalloc feature will then have to be disabled manually again with vxtunefs.
- Care must be taken in an Oracle Cluster Ready Services (CRS) environment. Using vxtunefs to make a change to the CRS volume could cause a delay in the cluster heartbeat. Disable CRS monitoring first, and keep it disabled while making a vxtunefs change to /crs.
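Since a resize or remount may silently re-enable dalloc, it can help to re-check the tunable after such operations. The sketch below only parses vxtunefs output in the "dalloc_enable = N" form shown in this article; the function name is hypothetical, and the re-apply step is left as a commented example because it changes a live file system.

```shell
# Read `vxtunefs <mount_point>` output on stdin and print the dalloc_enable value.
dalloc_state() {
    awk '$1 == "dalloc_enable" && $2 == "=" { print $3 }'
}

# Example use after a resize or remount (not run here, requires VxFS):
#   [ "$(vxtunefs /testmnt1 | dalloc_state)" = "0" ] ||
#       vxtunefs -s -o dalloc_enable=0 /testmnt1
```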
1. To disable dalloc on a file system that is already mounted:
# vxtunefs -s -o dalloc_enable=0 $MOUNT_POINT
# vxtunefs -s -o dalloc_enable=0 /testmnt1
2. To make the value persistent across a system reboot, add an entry to the /etc/vx/tunefstab file. This file contains tuning parameters for VxFS. They are set automatically during the file system mount.
To set per file system
# cat /etc/vx/tunefstab
/dev/vx/dsk/testdg/testvol1 dalloc_enable=0
To set as the system default
# cat /etc/vx/tunefstab
system_default dalloc_enable=0
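Adding the tunefstab entry can be made repeatable so that running it twice does not duplicate the line. This is a minimal sketch with a hypothetical function name; it only appends an exact "target dalloc_enable=0" line when that line is missing, and it does not detect an existing conflicting entry (for example dalloc_enable=1 for the same device) or lines that use different whitespace.

```shell
# Append "<target> dalloc_enable=0" to a tunefstab-style file unless that exact
# line is already present. $1 is the file, $2 is a block device path or the
# literal word system_default.
add_tunefstab_entry() {
    file=$1
    target=$2
    entry="$target dalloc_enable=0"
    touch "$file" || return 1
    grep -qxF "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Example use (not run here):
#   add_tunefstab_entry /etc/vx/tunefstab /dev/vx/dsk/testdg/testvol1
#   add_tunefstab_entry /etc/vx/tunefstab system_default
```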
Detailed explanation of the above workaround
To make the setting persistent, update the /etc/vx/tunefstab file. This file contains tuning parameters for Veritas File Systems which are set automatically during filesystem mount.
Each entry in tunefstab is a line of fields in one of the following formats:
block_device tunefs options
system_default tunefs options
block_device is the name of the device on which the file system is mounted. If there is more than one line that specifies options for a device, each line is processed, and the options are set in order.
In place of block_device, system_default specifies tunables for each device to process. If an entry for both block_device and system_default exists, then the device value takes precedence.
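The precedence rule can be illustrated with a small parser that reports which dalloc_enable value would apply to a given device. This is only a model of the rules described above, not how VxFS itself reads the file: the function name is hypothetical, and the sketch assumes options on a line are separated by commas or whitespace. A later line for the same device overrides an earlier one, and a device entry overrides system_default.

```shell
# Print the dalloc_enable value that a tunefstab-style file ($1) implies for
# a block device ($2). Prints nothing if neither the device nor
# system_default sets it.
effective_dalloc() {
    awk -v dev="$2" '
        {
            for (i = 2; i <= NF; i++) {
                n = split($i, opts, ",")
                for (j = 1; j <= n; j++) {
                    if (split(opts[j], kv, "=") == 2 && kv[1] == "dalloc_enable") {
                        if ($1 == dev) devval = kv[2]
                        else if ($1 == "system_default") sysval = kv[2]
                    }
                }
            }
        }
        END {
            if (devval != "") print devval
            else if (sysval != "") print sysval
        }
    ' "$1"
}

# Example use (not run here):
#   effective_dalloc /etc/vx/tunefstab /dev/vx/dsk/testdg/testvol3
```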
Steps for making the VxFS File system tuning values persistent, per file system:
1. For the file system already mounted:
# vxtunefs -s -o dalloc_enable=0 $MOUNT_POINT
# vxtunefs -s -o dalloc_enable=0 /testmnt1
2. Create the /etc/vx/tunefstab file, if it does not exist, and add the entry as shown in the examples below.
3. Confirm the value after mounting:
# vxtunefs /testmnt1 | grep "dalloc_enable"
dalloc_enable = 0
Example using a single file system:
# vxtunefs /testmnt1 | grep "dalloc_enable"
dalloc_enable = 1
# vxtunefs -s -o dalloc_enable=0 /testmnt1
UX:vxfs vxtunefs: INFO: V-3-22525: Parameters successfully set for /testmnt1
# vxtunefs /testmnt1 | grep "dalloc_enable"
dalloc_enable = 0
To make it persistent, add an entry to /etc/vx/tunefstab.
# cat /etc/vx/tunefstab
/dev/vx/dsk/testdg/testvol1 dalloc_enable=0
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
Example using multiple file systems:
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 1
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 1
# cat /etc/vx/tunefstab
/dev/vx/dsk/testdg/testvol1 dalloc_enable=0
/dev/vx/dsk/testdg/testvol2 dalloc_enable=0
# umount /testmnt1/
# umount /testmnt2/
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 0
Steps for making the VxFS file system tuning values persistent for all file systems (system-wide default setting):
This is helpful if you need to apply the tuning system-wide. If the system_default is specified instead of block_device, the tunable setting will be applied to all of the block devices when mounting the file systems.
1. Create the /etc/vx/tunefstab file if it does not already exist.
# touch /etc/vx/tunefstab
# ls -l /etc/vx/tunefstab
-rw-r--r-- 1 root system 75 Aug 21 17:11 /etc/vx/tunefstab
2. Add the entry to the /etc/vx/tunefstab file as shown in the example below.
3. After mounting the file system on the device, confirm the setting with the vxtunefs -p <mntpt> command, or filter the output as follows:
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
Example using the system default:
# cat /etc/vx/tunefstab
system_default dalloc_enable=0
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 0
Note: The system_default setting is overridden if a block_device is also explicitly specified.
For example, if we want to change the system-wide default for dalloc_enable to 0, but do not want to set it for /dev/vx/dsk/testdg/testvol3, then we can achieve this as follows:
# cat /etc/vx/tunefstab
system_default dalloc_enable=0
/dev/vx/dsk/testdg/testvol3 dalloc_enable=1
# mount -F vxfs /dev/vx/dsk/testdg/testvol1 /testmnt1/
# mount -F vxfs /dev/vx/dsk/testdg/testvol2 /testmnt2/
# mount -F vxfs /dev/vx/dsk/testdg/testvol3 /testmnt3/
# vxtunefs /testmnt1/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt2/ | grep "dalloc_enable"
dalloc_enable = 0
# vxtunefs /testmnt3/ | grep "dalloc_enable"
dalloc_enable = 1