Problem (Abstract)
You run an ontape backup but it runs too slowly.
Cause
- TAPEBLK is set too low in the ONCONFIG file to take full advantage of the device's I/O throughput potential.
- The actual I/O speed is lower than the expected I/O speed of the devices used.
- Not enough system resources, such as CPU or memory, are available. The system might be busy with other work.
Diagnosing the problem
- Run a timed test to determine the approximate speed of I/O from a chunk to the backup device. Run the dd command with a block size (bs) equal to the TAPEBLK value in the ONCONFIG file. Try to use a chunk that is at least 2 GB.
Note: This test might not work on a running instance because an OS lock on the chunk can prevent the dd command from reading it. If you cannot run the dd command on an actual chunk, then you must use or create a file that resides in the same location as the chunks (the same directory, same disk, same NFS mount, same SAN, and so on). This is critical for the test to yield accurate timings.
Here is the command:
timex dd if=/full/path/to/informix/chunk of=/full/path/to/backup_directory/timetest.out bs=128k
Here is an actual example:
timex dd if=/informix/chunks/rootdbs of=/informix/backups/timetest.out bs=256k
586+1 records in
586+1 records out
real 7.76
user 0.00
sys 0.48
This dd test copied 586 full records of 256 KB (about 146.5 MB) in 7.76 seconds, so the speed for this backup would be about 18.9 MB/second or 66.4 GB/hour. Treat this as the approximate speed of ontape I/O, and allow some margin for ontape overhead.
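The arithmetic behind those figures can be reproduced with a small helper. This is only a sketch: the function name dd_throughput is made up for illustration, and it ignores the trailing "+1" partial record, so the result slightly underestimates the real rate.

```shell
# Estimate throughput from the dd record count, block size, and elapsed time.
# Arguments: full_records  block_size_kb  elapsed_seconds
# dd_throughput is a hypothetical helper name, not part of Informix or dd.
dd_throughput() {
    awk -v recs="$1" -v bs_kb="$2" -v secs="$3" 'BEGIN {
        mb = recs * bs_kb / 1024                 # data copied, in MB
        mb_per_sec = mb / secs                   # sustained rate
        gb_per_hour = mb_per_sec * 3600 / 1024   # same rate per hour, in GB
        printf "%.1f MB/s, %.1f GB/hour\n", mb_per_sec, gb_per_hour
    }'
}

# Values from the example above: 586 full records of 256 KB in 7.76 seconds.
dd_throughput 586 256 7.76
# prints: 18.9 MB/s, 66.4 GB/hour
```

Dividing the total size of the dbspaces to be archived by the GB/hour figure gives a rough lower bound on how long the backup should take.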
- Capture and review the ontape function stacks to find out exactly what ontape is doing. Soon after you start the ontape backup, you should see these 3 threads listed in the onstat -g ath output:
ontape
arcbackup1
arcbackup2
The output will be similar to this:
...
213 41dac918 40347be0 1 cond wait netnorm 1cpu ontape
214 41980a28 40348c50 1 IO Wait 1cpu arcbackup1
215 41980c88 40344a90 2 sleeping secs: 1 1cpu arcbackup2
To get the stack trace from each thread, you need to run 3 separate commands. The first column of the onstat -g ath output is the thread id. Using the example output above, you would run these commands to continuously capture stack information for each of the 3 threads every 5 seconds:
onstat -g stk 213 -r > ontape.out
onstat -g stk 214 -r > arcbackup1.out
onstat -g stk 215 -r > arcbackup2.out
You can run these in the background or in 3 separate windows. Stop them when the backup is complete.
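If you prefer to script this rather than use 3 windows, the following is a minimal sketch. The helper names (backup_threads, start_capture, stop_captures) and the ONSTAT variable are made up for illustration. It assumes the thread name is the last column of the onstat -g ath output, as in the sample above.

```shell
# Command used to query the server. ONSTAT is a hypothetical override,
# handy for trying the helpers without a running instance.
ONSTAT="${ONSTAT:-onstat}"
CAPTURE_PIDS=""

# Read "onstat -g ath" output on stdin and print "<thread id> <thread name>"
# for the ontape and arcbackupN threads (id = first column, name = last).
backup_threads() {
    awk '$NF ~ /^(ontape|arcbackup[0-9]+)$/ { print $1, $NF }'
}

# Start one repeating stack capture in the background.
# Arguments: thread_id  output_file
start_capture() {
    "$ONSTAT" -g stk "$1" -r > "$2" 2>&1 &
    CAPTURE_PIDS="$CAPTURE_PIDS $!"
}

# Stop every capture that start_capture launched, then reap the jobs.
stop_captures() {
    for pid in $CAPTURE_PIDS; do
        kill "$pid" 2>/dev/null || true
    done
    wait
    CAPTURE_PIDS=""
}

# Example usage (left as comments because it needs a running backup):
#   "$ONSTAT" -g ath | backup_threads      # look up the 3 thread ids
#   start_capture 213 ontape.out
#   start_capture 214 arcbackup1.out
#   start_capture 215 arcbackup2.out
#   ... wait for the ontape backup to finish ...
#   stop_captures
```

The thread ids change with every backup, so look them up again each time rather than reusing 213, 214, and 215.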
Resolving the problem
- Increase the value of the ONCONFIG configuration parameter TAPEBLK. If using a tape device, then use the device's suggested maximum block size. If writing the backup to disk, then continue increasing TAPEBLK on subsequent backups until increasing it no longer reduces the time to complete the backup (you could start at 128, then go to 256, 512, 1024, and so on).
- Use a faster backup device or find ways to increase the I/O throughput of that device. Some examples:
  - use a faster tape drive or a tape that can handle higher I/O speeds
  - back up to local disk, then ftp the backup to a SAN
- Collect ontape thread stack traces and analyze or send to technical support for analysis.
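Before changing TAPEBLK across several real backups, you can repeat the dd timing test from the diagnosis section at a few block sizes to see roughly where the gains level off. This is a sketch only. The function name time_block_sizes is made up, the timing has whole-second resolution, and dd throughput is only an approximation of what ontape will achieve.

```shell
# Time the same dd copy at several block sizes and print one line per size.
# Arguments: source_file  destination_file  block_size_kb...
# time_block_sizes is a hypothetical helper name.
time_block_sizes() {
    src="$1"; dst="$2"; shift 2
    for kb in "$@"; do
        start=$(date +%s)
        dd if="$src" of="$dst" bs="${kb}k" 2>/dev/null
        end=$(date +%s)
        echo "bs=${kb}k elapsed=$((end - start))s"
        rm -f "$dst"    # remove the test output before the next run
    done
}

# Example usage (placeholder paths, as in the dd command earlier):
#   time_block_sizes /full/path/to/informix/chunk \
#       /full/path/to/backup_directory/timetest.out 128 256 512 1024
```

Later runs may look faster only because the operating system has cached the source file, so use a file larger than available memory or repeat the runs in a different order before drawing conclusions.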