InfoScale™ 9.0 Replication Administrator's Guide - AIX
Tunable parameters for the buffer space on the Secondary
The amount of buffer space available for requests coming in to the Secondary over the network is determined by the VVR tunable, vol_max_nmpool_sz, which defaults to 64 megabytes. VVR allocates separate buffer space for each Secondary RVG, the size of which is equal to the value of the tunable vol_max_nmpool_sz. The buffer space on the Secondary must be large enough to prevent slowing the network transfers excessively.
A buffer that is too large can also cause problems. When a write arrives at the Secondary, the Secondary sends an acknowledgment to the Primary so that the Primary knows the transfer is complete. When the write is written to the data volume on the Secondary, the Secondary sends another acknowledgment, which tells the Primary that the write can be discarded from the SRL. However, if this second acknowledgment is not sent within one minute, the Primary disconnects the RLINK. The RLINK reconnects immediately, but the disconnection disrupts the network flow and can cause other problems. The buffer space on the Secondary should therefore be sized so that no write can remain in it for as long as one minute. This size depends on the rate at which the data can be written to the disks, which in turn depends on the disks themselves, the I/O buses, the load on the system, and the nature of the writes (random or sequential, small or large).
If the write rate is W megabytes/second, the size of the buffer should be no greater than W * 50 megabytes, that is, 50 seconds' worth of writes.
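The 50-second rule can be checked against the default pool size with a few lines of arithmetic. The following is only an illustrative sketch: the function name and the sample write rates are hypothetical, and the 64-megabyte default is the value stated earlier in this section.

```python
# Illustrative sketch only: names and sample rates are hypothetical.
DEFAULT_POOL_MB = 64   # default vol_max_nmpool_sz stated in this section
WINDOW_S = 50          # 50 seconds' worth of writes


def max_pool_mb(write_rate_mb_s, window_s=WINDOW_S):
    """Upper bound on the Secondary buffer size for a write rate W (MB/s)."""
    return write_rate_mb_s * window_s


for rate in (0.5, 1.0, 2.0, 10.0):
    limit = max_pool_mb(rate)
    verdict = "within" if DEFAULT_POOL_MB <= limit else "above"
    print(f"W = {rate:4.1f} MB/s: limit {limit:6.1f} MB, default is {verdict} the limit")
```

Under this rule, a Secondary that sustains less than 64/50 = 1.28 megabytes per second would have a limit below the default value.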
There are various ways to measure W. If the disks and volume layouts on the Secondary are comparable to those on the Primary and you have I/O statistics from the Primary before replication was implemented, these statistics can serve to arrive at the maximum write rate.
Alternatively, if replication has already been implemented, start by sizing the buffer space on the Secondary to be large enough to avoid timeout and memory errors.
While replication is active at the peak rate, run the following command and make sure there are no memory errors and the number of timeout errors is small:
# vxrlink -g diskgroup -i5 stats rlink_name
Then, run the vxstat command to get the lowest write rate:
# vxstat -g diskgroup -i5
The output looks similar to this:
                          OPERATIONS          BLOCKS    AVG TIME(ms)
TYP NAME                READ   WRITE    READ   WRITE    READ   WRITE
Mon 29 Sep 2003 07:33:07 AM PDT
vol srl1                   0    1245       0    1663     0.0     9.0
vol archive                0     750       0     750     0.0     9.0
vol archive-L01            0     384       0     384     0.0     5.9
vol archive-L02            0     366       0     366     0.0    12.1
vol ora02                  0     450       0     900     0.0    11.1
vol ora03                  0       0       0       0     0.0     0.0
vol ora04                  0       0       0       0     0.0     0.0
Mon 29 Sep 2003 07:33:12 AM PDT
vol srl1                   0     991       0    1389     0.0    20.1
vol archive                0     495       0     495     0.0    10.1
vol archive-L01            0     256       0     256     0.0     5.9
vol archive-L02            0     239       0     239     0.0    14.4
vol ora02                  0     494       0     988     0.0    10.0
vol ora03                  0       0       0       0     0.0     0.0
vol ora04                  0       0       0       0     0.0     0.0
For each interval, add the numbers in the blocks written column for the data volumes, but do not include the SRL or any subvolumes. For example, archive-L01 and archive-L02 are subvolumes of the volume archive, and the statistics of the writes to the subvolumes are already included in the statistics for the volume archive. You may vary the interval, the total time you run the test, and the number of times you run the test according to your needs.
In this example, the interval is 5 seconds and the count is in blocks. On a machine with a block size of 2 kilobytes, the number of megabytes per interval, M, is (total * 2048)/(1024*1024), where total is the sum for one interval. The number of megabytes per second is therefore M/5, and the size of the buffer is (M/5)*50. If there is more than one Primary, do not increase the buffer size beyond this number.
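The calculation described above can be sketched as a short script. This is a hedged illustration, not a supported tool: the function names are hypothetical, the parser assumes the eight-column layout of the sample vxstat output shown in this section with the header lines removed, and the SRL and subvolume names to skip must be supplied by hand. The 2-kilobyte block size and 5-second interval are the values used in this example and may differ on your system.

```python
# Illustrative sketch only: function names are hypothetical, and the parser
# assumes the sample vxstat layout shown above (header lines removed).
SAMPLE = """\
Mon 29 Sep 2003 07:33:07 AM PDT
vol srl1 0 1245 0 1663 0.0 9.0
vol archive 0 750 0 750 0.0 9.0
vol archive-L01 0 384 0 384 0.0 5.9
vol archive-L02 0 366 0 366 0.0 12.1
vol ora02 0 450 0 900 0.0 11.1
vol ora03 0 0 0 0 0.0 0.0
vol ora04 0 0 0 0 0.0 0.0
Mon 29 Sep 2003 07:33:12 AM PDT
vol srl1 0 991 0 1389 0.0 20.1
vol archive 0 495 0 495 0.0 10.1
vol archive-L01 0 256 0 256 0.0 5.9
vol archive-L02 0 239 0 239 0.0 14.4
vol ora02 0 494 0 988 0.0 10.0
vol ora03 0 0 0 0 0.0 0.0
vol ora04 0 0 0 0 0.0 0.0
"""

# The SRL and the subvolumes are excluded, as the text requires.
EXCLUDE = {"srl1", "archive-L01", "archive-L02"}


def blocks_written_per_interval(text, exclude):
    """Sum the blocks-written column of the data volumes for each interval."""
    totals = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "vol" and len(fields) == 8:
            if totals and fields[1] not in exclude:
                totals[-1] += int(fields[5])  # sixth column: blocks written
        else:
            totals.append(0)  # a timestamp line starts a new interval
    return totals


def pool_size_mb(total_blocks, interval_s=5, block_bytes=2048, window_s=50):
    """Apply (M / interval) * 50, where M is megabytes written per interval."""
    mb_per_interval = total_blocks * block_bytes / (1024 * 1024)
    return mb_per_interval / interval_s * window_s


totals = blocks_written_per_interval(SAMPLE, EXCLUDE)
lowest = min(totals)
print(f"blocks written per interval: {totals}")
print(f"buffer size for the lowest interval: {pool_size_mb(lowest):.2f} MB")
```

For the sample output, the per-interval totals are 1650 and 1483 blocks, and the lower total corresponds to a buffer size of roughly 29 megabytes.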
The writes to the SRL should not be considered part of the I/O load of the application. However, in asynchronous mode, the Secondary writes the incoming updates to both the Secondary SRL and the data volumes, so it may be necessary to make the value of vol_max_nmpool_sz slightly larger. Even so, to avoid the problems discussed at the beginning of this section, the calculated vol_max_nmpool_sz value should still ensure that writes do not remain in the pool for more than one minute.