728x90

Question

Your read cache rate (onstat -p) rises durning server use to ninety-nine percent. You suspect that you have overallocated buffers to the extent that you are wasting memory. How can you determine how much of your buffer cache memory is used?

Answer

Count the number of buffers in use at regular intervals and compare with the total allocated. You can use the following two methods to count the number of buffer pages in use at a given point in time:

  • Execute the onstat -B command. The number of data lines in the output is the number of buffers in use. The onstat -B output includes all the configured buffer pools, grouped by page size. You will have to count the lines in the individual groups if you have more than one buffer pool.


  • Open the sysmaster database and execute the following query:

    SELECT COUNT(*) AS pages_used
    FROM sysbufhdr
    WHERE bufsize = <buffer_page_size>
    AND bflags != 4;

    The buffer_page_size is the page size (in bytes) for the buffer pool you are examining. The buffer page size is listed in the onstat -b/B output. If you only have one buffer pool, you can leave out the buffer page size filter and the query becomes:

    SELECT COUNT(*) AS pages_used
    FROM sysbufhdr
    WHERE bflags != 4;

 

https://www.ibm.com/support/pages/how-do-i-know-if-my-buffers-are-overallocated

 

How do I know if my Buffers are Overallocated?

How do I know if my Buffers are Overallocated?

www.ibm.com

 

728x90
728x90

Question

Your IBM Informix server is processing queries more slowly than expected. You want to determine what is slowing the instance. Were do you start your analysis?

Answer

Preconditions

  • This performance problem is not limited to one query or a class of queries.
  • The database statistics (UPDATE STATISTICS) were collected with appropriate performance considerations and are current.
  • The performance is a problem for both network connections to the server and shared memory connections.

Recommended Approach

Informix servers provides a number of commands, the onstats, for reporting system status. You can generally start your analysis using seven of the commands. Examination of the data often leads to further analysis and resolution. The recommended commands are:

  • onstat -m

    Message log. The message log contains warning and error messages. The onstat -m only reports recent updates to the log file. You may want to look at earlier log entries.

  • onstat -u

    User thread status. Look for user threads waiting on a resource. You will see a wait indicator in the first position of the flags field. The address of the resource is in the wait field. The specific wait flag, the type of resource, and cross references follow:

    B - wait on buffer - match the wait address to a buffer in onstat -b

    C - wait on checkpoint - examine onstat -g ckp and onstat -g con

    G - wait on write to log buffer - match the wait address to a log buffer in onstat -l

    L - wait on lock - match the wait address to the address of a lock in onstat -k

    S - wait on latch - match the wait address to a latch (mutex) in onstat -lmx and onstat -wmx

    Y - wait on condition - listed in onstat -g con and do not typically reflect performance problems to the extent that they help with diagnosis

    There are several other flags but they are rarely observed.

    The first field of the onstat -u output, address, maps to the rstcb field of the onstat -g ath output for thread identification. The sessid field is the session id (SID). You can learn more about resources allocated to a session and its activity with onstat -g ses <SID>.

    Collect onstat -u several times in rapid succession. If the waiters persist over a measurable clock time, then the chances are very good that the waits reflect a processing problem that affects performance. Some waiters are normal but they should disappear quickly. Keep in mind that lock waits are programmed.

    The last two fields of onstat -u, nreads and nwrites, can be useful indicators of thread activity.

  • onstat -p

    Server statistics. The single most important performance statistic in this output is the read cache rate (the first %cached field). Your goal for an OLTP system is to have a read cache rate above 95 percent during normal processing. Try for a rate above 99 percent. Increase the cache rate by adding buffers, which are configured using the BUFFERPOOL configuration parameter. Make sure that you have plenty of system memory (onstat -g osi) when increasing the Informix server memory allocation. Increasing the server memory footprint can indirectly slow the instance by increasing paging/swapping at the OS side. You can use a sysmaster query (see Related Information below) to help determine if you have configured too many buffers.

    Other statistics, like bufwaits (waiting for buffer), seqscans (sequential scans), and the read aheads should be considered. In the case of read aheads, the sum of ixda-RA, idx-RA, and da-RA should be close to the value of RA-pgsused as an indicator of effective read-ahead configuration.

    Many of the same statistics are viewed on a partnum level with onstat -g ppf.

    Consider collecting and retaining onstat -p outputs at regular intervals for comparison. Note that cumulative statistics are zeroed with onstat -z if you want to observe statistics over a limited time interval.

  • onstat -g rea

    Ready queue. Reports threads ready to execute at one moment in time. These threads are prepared for execution but lack cpu resource. If the number remains above the number of cpu virtual processors (onstat -g glo) through repetitions of onstat -g rea, then your system is likely limited by processing power. Look for inefficient queries and non-server processing on the machine. See onstat -g glo output for cpu use integrated over time and onstat -g osi for system resources.

  • onstat -g act

    Active threads. You will usually see poll threads and listen threads in this queue. Look for threads doing query-related work, like sqlexec and I/O threads. If none show up or they are rare, then work is not getting to the processors.

  • onstat -g ioq

    I/O request queues. The statistic to monitor is the maxlen column. This is the maximum length of a queue after the engine was brought online or onstat -z was executed. If the number is too large, then at some point the I/O requests were not serviced fast enough. Try executing onstat -z and checking to see how long it takes for large maxlen values to return.

    For informix aio monitor the gfd (global file descriptor) queues. The maxlen should not be greater than 25. The system uses gfd 0, 1, and 2 (stdin, stdout, stderr), so the informix chunks start with gfd 3 (chunk 1).

    If you are using kaio monitor the kio queues. The maxlen values should not exceed 32. There will be one kio queue for each cpu virtual processor.

    Recommendations for maxlen with DIRECT_IO enabled have not been determined, but are not expected to be larger than 32.

  • onstat -g seg

    Shared memory allocations. The shared memory configuration parameters should be tuned so that the server dynamically allocates at most two virtual segments during normal processing.

 

 

https://www.ibm.com/support/pages/where-do-i-start-diagnosis-generally-slow-performance-informix-servers

 

Where do I start with Diagnosis of Generally Slow Performance on Informix Servers?

Where do I start with Diagnosis of Generally Slow Performance on Informix Servers?

www.ibm.com

 

728x90
728x90

얼마전에 고객사 인포믹스에서 243 오류(Could not position within a table table-name)가 발생해서 onmode -I 명령으로 동일한 오류가 발생했을 때 진단 정보를 수집하도록 설정해두었습니다. 

 

그런데 진단 정보를 수집하는 설정 상태를 확인하는 방법이 있는지 알고싶어 IBM Community에 질문 글을 썼습니다.

onmode -I 명령을 수행하면 아래와 같이 인포믹스 온라인 로그에 메시지가 표시되기는 하지만 현재 설정되었는지 여부를 확인하는 방법을 알고 싶었습니다.

11:55:15  Verbose error trapping set, errno = 243, session_id = -1

 

질문 글을 올리고 금방 답변을 받았는데.. onstat 명령에 숨겨진 옵션이 있더군요.

$ onmode -I 243
$ onstat -g ras
IBM Informix Dynamic Server Version 11.70.FC9 -- On-Line -- Up 01:07:23 -- 23486904 Kbytes
 Diagnostics           Current Settings
 AFCRASH               0x00002481
 AFFAIL                0x00000401
 AFWARN                0x00000401
 AFDEBUG               0
 AFLINES               0
 BLOCKTIMEOUT          3600
 Block Status          Unblocked
 CCFLAGS               0x00000000
 CCFLAGS2              0x00000200
 DUMPCORE              0
 DUMPGCORE             0
 DUMPSHMEM             1
 DUMPCNT               1
 DUMPDIR               /work1/informix/ids1150fc6/tmp
 RHEAD_T               0x00000000541c3800
 SYSALARMPROGRAM       /work1/informix/ids1150fc6/etc/evidence.sh
 LTXHWM                70%
 LTXEHWM               80%
 DYNAMIC_LOGS          2
Traperror info:
 Errno          Session(s)
 243            All

onstat -g ras 명령의 출력 결과입니다. -g ras 옵션은 onstat 도움말과 IBM의 매뉴얼에 설명되지 않은 숨겨진 옵션입니다. 결과의 Traperror info 부분에서 현재 에러번호(Errno)와 특정 세션번호에 대해 진단 정보를 수집하도록 설정되어 있음을 확인할 수 있습니다.

728x90
728x90

6월 23일자로 인포믹스 14.10 버전의 새로운 Fix Pack 14.10.xC4W1이 공개되었습니다.

아직 문서에 새로운 기능 소개는 나오지 않아서 onstat 유틸리티에 새로운 옵션이 생겼나 찾아보니 하나가 눈에 띄네요.

 

onstat -g top이라는 옵션이 새로 생겼습니다. 리눅스의 top과 유사한 기능을 제공하는 옵션인데요. 기본값으로 5초간의 성능수치를 비교하여 시스템 자원을 많이 사용하는 세션이나 스레드를 확인할 수 있습니다. 시간 간격이나 표시할 라인 수, 반복 횟수를 지정할 수 있습니다.

아래는 onstat 명령을 실행했을 때의 top 옵션 설명입니다.

        top [ <entity> <stat> [ <max lines> [ <intvl> [ <reps> ]]]]
            Print top consumers of various resources over specified interval
            Valid <entity> <stat> combinations:
                thread    cpu   (CPU usage)
                thread    drd   (disk reads)
                thread    bfr   (buffer reads)
                thread    bfw   (buffer writes)
                thread    plg   (physical log usage)
                thread    llg   (logical log usage)
                session   cpu   (CPU usage)
                session   drd   (disk reads)
                session   bfr   (buffer reads)
                session   bfw   (buffer writes)
                session   plg   (physical log usage)
                session   llg   (logical log usage)
                chunk     ios   (page reads/writes)
                chunk     art   (average read times)
                chunk     awt   (average write times)
                space     ios   (page reads/writes)
                space     art   (average read times)
                space     awt   (average write times)
                mempool   gro   (memory growth)
                sessmem   gro   (memory growth)
                partition drd   (disk reads)
                table     drd   (disk reads)

위에 나열된 항목을 전부 또는 지정해서 볼 수 있습니다.

watch 명령을 이용해서 인포믹스를 모니터링 하는 것도 괜찮아 보입니다.

아래는 제가 약간의 트랜잭션을 발생시키고 나서 top 옵션으로 본 화면입니다.

[informix@db2 skjeong]$ onstat -g top 10 3
IBM Informix Dynamic Server Version 14.10.FC4W1DE -- On-Line -- Up 18:46:36 -- 2631308 Kbytes
Top Resource Usage (Max lines 10, Time interval 3 seconds):
Top Threads (CPU usage)
 tid    name              sid        CPU_time   #scheds  status
 343    sqlexec           208          0.0014         6  sleeping secs: 1
 344    sqlexec           209          0.0009         6  sleeping secs: 1
 345    sqlexec           210          0.0009         6  sleeping secs: 1
 19     periodic          14           0.0001         5  sleeping secs: 1
Top pools (memory growth)
 name                           increase(b)      total_size(b)
 rsam                           8                3081712
(No partition disk reads to display)
(No DBspace activity to display)
(No physlog activity to display)
Top threads (logical log usage)
 tid      rstcb              llog_bytes   name
 344      0x459af1a8         552          sqlexec
 343      0x459afa88         552          sqlexec
 345      0x459b0c48         552          sqlexec

기존에 스크립트로 모니터링 하시는 분들도 계시겠지만, onstat 명령으로 본다면 좀 더 시스템에 부담은 덜 줄 것 같군요. 실시간 모니터링에 많은 도움이 될 것 같습니다.

 

다른 유용한 기능이 더 생겼는지 찾아봐야겠네요.

728x90

+ Recent posts