Stripe set performance considerations for Veritas Storage Foundation

Problem

Stripe set performance considerations for Veritas Storage Foundation

Solution

 

This article is part of a series on troubleshooting volume performance. Click here to start at the beginning: http://www.symantec.com/docs/TECH202712

 

Table of Contents


Introduction
Using vxtrace to determine I/O characteristics
Sequential I/O
Random I/O
Determining the current stripe unit size
Matching the stripe size to the file system allocation unit size




Introduction



Striping data across multiple spindles (physical disks) allows I/O to be processed in parallel, increasing performance. However, the traditional advantages of software-based stripe sets are sometimes outweighed by changes and improvements in modern storage hardware. Today, disk arrays typically provide their own hardware-based striping, which should be taken into consideration to avoid layering multiple RAID implementations that may conflict with each other. Different applications, such as databases or file servers, have dissimilar I/O characteristics that are affected by striping in varying ways.

In theory, as more spindles are added to a stripe set, more I/O is processed in parallel, potentially improving performance. However, the increase in parallelism must be weighed against the additional disk movement that results from fragmenting I/O across multiple columns. As columns are added, one eventually reaches a point of diminishing returns, where adding further columns no longer provides a significant improvement in I/O, or is not worth the increased risk of a hardware failure. Every spindle added to a stripe set increases the chance that a single hardware failure will take the entire volume with it: as an illustration, if each of six disks independently has a 1% chance of failing in a given period, the chance that at least one fails (and the volume with it) is roughly 1 - 0.99^6, or about 6%.
 


Note: Do not assume that a larger number of columns will outperform a smaller number, that one stripe unit size will outperform another, or that a striped volume will necessarily outperform a concatenated volume.

Too many variables affect performance for such assumptions to hold in all cases, and there is no substitute for testing. Before putting a volume into production, use benchmarking tools to test I/O performance across different layouts, in a manner that is representative of the intended production environment. This is the only reliable way to determine which layout provides the best performance.
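As a rough illustration, a simple sequential-write test can be run with dd (a generic sketch, not a substitute for a proper benchmarking tool; the device path, request size, and count below are placeholders, and writing to a raw volume destroys any data on it):

# time dd if=/dev/zero of=/dev/vx/rdsk/datadg/testvol bs=64k count=16384

Repeating such a test with the volume in different layouts (for example, different numbers of columns and stripe unit sizes) provides a first-order comparison; a tool that can replay a realistic mix of request sizes and of random and sequential access will be more representative.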


 


 

Using vxtrace to determine I/O characteristics



Vxtrace can be used to analyze the characteristics of the I/O being written to a volume (Figure 1). This is useful for distinguishing random I/O from sequential I/O, for determining the typical length (in sectors) of each I/O transaction, and for observing how I/O is fragmented across multiple columns. The optimal stripe unit size ultimately depends on the characteristics of the I/O generated by the application.

Finding the typical I/O length is important for determining an appropriate stripe unit size.

  • I/O lengths that are larger than the stripe width will be broken across multiple columns.
  • I/O lengths that are smaller than, or equal to, the stripe unit size will fit entirely within a single column and not touch the others.

Note: The vxtrace excerpts in this article are very brief to improve readability. Reviewing a larger sample is recommended in order to include data that is representative of the production environment.


 


Figure 1 - Using vxtrace to gather information about I/O to a volume


Syntax:

vxtrace -t <time_in_seconds> -g <diskgroup> -o dev,disk <volume> > <outputfile>


Example, with typical output:

# vxtrace -t 10 -g datadg -o dev,disk engvol > /tmp/vxtrace.engvol
# tail /tmp/vxtrace.engvol
6432 START write disk disk_3 op 6430 block 392248 len 128
6433 START write vdev engvol block 326584 len 128 concurrency 1 pid 32331
6434 START write disk disk_6 op 6433 block 494776 len 128
6435 START write disk disk_3 op 6433 block 392376 len 128
6436 START write vdev engvol block 326712 len 128 concurrency 2 pid 32331
6437 START write disk disk_6 op 6436 block 494904 len 128
6438 START write disk disk_3 op 6436 block 392504 len 128
6439 START write vdev engvol block 326840 len 128 concurrency 3 pid 32331
6440 START write disk disk_6 op 6439 block 495032 len 128
6441 START write disk disk_3 op 6439 block 392632 len 128
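With a longer trace, the distribution of I/O lengths can be summarized with a quick one-liner (a sketch that assumes the field layout shown above; it counts only volume-level (vdev) write records, so adjust the pattern to include reads if needed):

# awk '/START write vdev/ {print $9}' /tmp/vxtrace.engvol | sort -n | uniq -c | sort -rn

Each line of the output shows a count followed by an I/O length in sectors, with the most common lengths listed first.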





Sequential I/O



Figure 2 shows an example of sequential I/O, as observed by vxtrace. Notice that the starting block for each I/O increases steadily from the previous operation. Also notice that the I/O length is usually 384 sectors.

For sequential I/O, optimal performance is generally achieved if I/O transactions are spread across multiple columns more frequently. This can be accomplished by using a stripe width that is smaller than the typical I/O length.

Note: Do not confuse "stripe unit size" with "stripe width." Stripe width refers to the stripe unit size multiplied by the number of columns. For example, a volume that has 3 columns and a stripe unit size of 128 sectors has a stripe width of 384 sectors.

 


Figure 2 - An example of vxtrace output showing sequential I/O

53595 START write vdev vol1 block 5785984 len 384 concurrency 1 pid 5855
53596 START write disk disk_5 op 53598 block 1994368 len 128
53597 START write disk disk_3 op 53598 block 1994496 len 128
53598 START write disk disk_4 op 53598 block 1994496 len 128
53595 END write vdev vol1 block 5785984 len 384
53596 END write disk disk_5 op 53598 block 1994368 len 128
53597 END write disk disk_3 op 53598 block 1994496 len 128
53598 END write disk disk_4 op 53598 block 1994496 len 128
53603 START write vdev vol1 block 5786752 len 384 concurrency 1 pid 5855
53604 START write disk disk_5 op 53606 block 1994624 len 128
53605 START write disk disk_3 op 53606 block 1994752 len 128
53606 START write disk disk_4 op 53606 block 1994752 len 128
53603 END write vdev vol1 block 5786368 len 384
53604 END write disk disk_5 op 53602 block 1994496 len 128
53605 END write disk disk_3 op 53602 block 1994624 len 128
53606 END write disk disk_4 op 53602 block 1994624 len 128
53611 START write vdev vol1 block 5786752 len 384 concurrency 1 pid 5855
53612 START write disk disk_5 op 53606 block 1994624 len 128
53613 START write disk disk_3 op 53606 block 1994752 len 128
53614 START write disk disk_4 op 53606 block 1994752 len 128
53615 START write vdev vol1 block 5787136 len 64 concurrency 2 pid 5855
53616 START write disk disk_5 op 53610 block 1994752 len 64
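For a workload like this one, dominated by 384-sector sequential writes, the guideline suggests a stripe width smaller than 384 sectors. A hypothetical example of creating such a volume (the names, length, and sizes are placeholders; unsuffixed vxassist sizes are interpreted as sectors):

# vxassist -g datadg make seqvol 10g layout=stripe ncol=3 stripeunit=64

This gives a stripe width of 3 x 64 = 192 sectors, so a typical 384-sector I/O is spread across all three columns.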

 



Random I/O



Figure 3 shows an example of random I/O. Notice that the starting block varies significantly. The I/O lengths also vary in this sample, but tend to be shorter than those in Figure 2.

For random I/O, optimal performance is generally achieved by containing each I/O transaction within a single column. To accomplish this, the stripe unit size should be larger than the average I/O size.


Figure 3 - An example of vxtrace output showing random I/O

43024 START write vdev vol1 block 33778 len 94 concurrency 1 pid 2202
43025 START write disk disk_5 op 43024 block 77042 len 14
43026 START write disk disk_3 op 43024 block 77056 len 80
43025 END write disk disk_5 op 43024 block 77042 len 14 time 3
43026 END write disk disk_3 op 43024 block 77056 len 80 time 3
43024 END write vdev vol1 op 43024 block 33778 len 94 time 3
43027 START write vdev vol1 block 1104 len 1 concurrency 1 pid 2203
43028 START write disk disk_5 op 43027 block 66128 len 1
43028 END write disk disk_5 op 43027 block 66128 len 1 time 2
43027 END write vdev vol1 op 43027 block 1104 len 1 time 2
43028 START write vdev vol1 block 1631 len 59 concurrency 1 pid 2202
43029 START write disk disk_3 op 43037 block 66399 len 33
43030 START write disk disk_4 op 43037 block 66304 len 26
43029 END write disk disk_3 op 43037 block 66399 len 33 time 3
43030 END write disk disk_4 op 43037 block 66304 len 26 time 3
43028 END write vdev vol1 op 43037 block 1631 len 59 time 3
43040 START write vdev vol1 block 36080 len 16 concurrency 1 pid 2203
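For a random workload like this one, the guideline suggests a stripe unit large enough that most transactions stay within a single column. If an existing striped volume needs a different stripe unit, online relayout can change it; a hypothetical example with placeholder values:

# vxassist -g datadg relayout vol1 layout=stripe ncol=3 stripeunit=256

The relayout runs in the background; the result can be verified afterwards with vxprint, as shown in the next section.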




Determining the current stripe unit size



Use vxprint to determine the current stripe unit size (Figure 4).

Figure 4 shows volume "mgmtvol" with the following characteristics, which appear in the plex (pl) line under the NCOL/WID column as 3/128:

  • 3 columns
  • stripe unit size of 128 sectors (64KB)
  • stripe width of 384 sectors (192KB), the stripe unit size multiplied by the number of columns


Figure 4 - Using vxprint to determine the stripe unit size of a volume


Syntax:

vxprint -htv <volume>


Example, with typical output:

# vxprint -htv mgmtvol

Disk group: datadg

V  NAME         RVG/VSET/CO  KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME  NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC                       PERMS    MODE     STATE

v  mgmtvol      -            ENABLED  ACTIVE   102400   SELECT    mgmtvol-01 fsgen
pl mgmtvol-01   mgmtvol      ENABLED  ACTIVE   102528   STRIPE    3/128    RW
sd datadg02-01  mgmtvol-01   datadg02 0        34176    0/0       disk_4   ENA
sd datadg01-03  mgmtvol-01   datadg01 921600   34176    1/0       disk_3   ENA
sd datadg04-03  mgmtvol-01   datadg04 921600   34176    2/0       disk_6   ENA
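On systems with many objects, the layout and NCOL/WID fields can be extracted directly from the plex record (a sketch that assumes the pl line format shown above):

# vxprint -htv mgmtvol | awk '$1 == "pl" {print $2, $7, $8}'
mgmtvol-01 STRIPE 3/128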

 


Matching the stripe size to the file system allocation unit size



A best practice is to set the stripe width to a multiple of the file system allocation unit size. For example, if the file system block size is 4KB, a stripe width of 384 sectors (192KB) is a valid multiple because 192KB divided by 4KB yields an integer (48). Recall that the stripe width is the stripe unit size multiplied by the number of columns.

Use fstyp to determine the file system block size (Figure 5).


Figure 5 - Using fstyp to determine the file system block size


Syntax:

fstyp -t|F vxfs -v <path_to_volume>


Example, with typical output:

# fstyp -t vxfs -v /dev/vx/rdsk/datadg/mgmtvol

vxfs
magic a501fcf5  version 9  ctime Wed 10 Apr 2013 11:37:59 AM PDT
logstart 0  logend 0
bsize  4096 size  12800 dsize  12800  ninode 0  nau 0
defiextsize 0  ilbsize 0  immedlen 96  ndaddr 10
aufirst 0  emap 0  imap 0  iextop 0  istart 0
bstart 0  femap 0  fimap 0  fiextop 0  fistart 0  fbstart 0
nindir 2048  aulen 32768  auimlen 0  auemlen 2
auilen 0  aupad 0  aublocks 32768  maxtier 15
inopb 16  inopau 0  ndiripau 0  iaddrlen 2   bshift 12
inoshift 4  bmask fffff000  boffmask fff  checksum f66f5f0c
oltext1 14  oltext2 1030  oltsize 1  checksum2 0
free 11993  ifree 0
efree  1 0 0 1 1 2 2 0 2 2 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
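The volume shown in Figure 4 follows this guideline for the 4KB block size reported here: its NCOL/WID value of 3/128 gives a stripe width of 384 sectors (192KB), which is 48 x 4KB. A new volume with the same geometry could be created as follows (a sketch; the volume name and length are placeholders):

# vxassist -g datadg make newvol 10g layout=stripe ncol=3 stripeunit=128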

 



 

