728x90

Question

Sometimes users might notice their sessions are stuck in "Commit Active" state for over a period of time and then the state goes away. What could be the possible reason for that ?

Cause

Session in "Commit Active" means it's performing the commit of a Unit of Work, but not being able to finish the same yet for some reasons.

Answer

Out of many possible reasons following could be some common ones :

- The database is using HADR and at the standby side it's hitting a momentary HADR log receive buffer full situation.

- Network issues affecting HADR communications between primary and standby

- The database has very large history file

- Disk bottleneck at the storage side where the database transaction logs are kept

 

 

To identify HADR receive buffer full situation just run "db2pd -hadr" at the standby side while the "Commit Active" state is present at the primary side of the HADR and see what the output of StandByRcvBufUsed field shows. If it's near 100% then it can cause the "Commit Active" situation in the primary side. Consider increasing DB2_HADR_BUF_SIZE registry setting to address the issue.

If a network issue is suspected between two sides of HADR pair then that should be taken to the network administrator to check on possible network bottleneck or any other kinds of network issues.

To check whether history file have become too big. Check for the file db2rhist.asc file under database directory. If it's big (example, more than 100MB or so) then try to prune it down.

Bottleneck in disk containing the active log directory or, mirror log path can cause the "Commit Active" situation most. Usually, iostat output collected while the "Commit Active" is in place can show if the disk bottleneck is in place. As for example, under AIX operating system collecting "iostat -TDla 1 30" might show the column "%tm" reaching near 100% for the disk containing log paths. If the %tm is 100% and all the data flow in that disk shows zero, that will indicate a momentary stalled disk situation.


 

http://www-01.ibm.com/support/docview.wss?uid=swg21650932&myns=swgimgmt&mynp=OCSSEPGG&mync=E

728x90
728x90

DB2에는 복구 히스토리 파일(db2rhist.asc)이 있는데 주로 테이블스페이스 생성, 삭제, 로그 아카이빙 등과 같은 유틸리티성 작업이 기록된다. 이 파일이 일정 크기 이상으로 커지면 Commit이 완료되는데 오래 걸리는 현상이 발생한다. 

아래 포스트를 보면 DPF환경에서 히스토리 파일이 약 240MB인 경우에 트랜잭션 행(hang)현상이 발생했다고 한다.



The history file in DB2 is a critical file that is usually overlooked.


Earlier this week I work with a customer that had been experiencing hangs on certain partitions (Yes this is DPF) daily since upgrading to a later fixpack, and intermittently on the earlier fixpack.


We found that DB2 was stuck in uninteruptable kernel code. As this was Linux, the usual tools to get a kernel trace were not available. Looking at the db2support provided by the customer, I saw that the db2rhist.asc (aka history file) was > 240MB.


The customer was advised to prune the history file, so that it only contained 1o days of history. This resulted in the db2rhist.asc only being 8 MB (which is still reasonable big). Since the pruning of the history file no further hangs have been experienced to date.


There are numerous cases where a large history file causes a slow down and even hang like symptoms. In a majority of cases where there is high load activity the history file can grow quickly and there are DB2 internal options so that Loads are not logged in the history file - see DB2 support for details.


It is good practice to keep the history file as small as possible.



좀 심한 경우에는 600MB까지 늘어난 경우도 있다.


Dear all,
I wrote some weeks ago about a problem with all the committing transactions blocked for many seconds in a Commit Active state. Well, finally I solved it and I just would like to share this information.

 As I suspected there was a lock. I suspected log writing due to thecommit active state and I was right. Using truss on the db2loggw process I found that when the db2 was blocked it was waiting on a Unix semaphore. I didn't find a way to find on AIX a command that gives you the pid of the semaphore locking process (you have the last operation process pid, that was the db2loggw itself). This semaphore lock appear just after a statx on
the db2rhist.asc file, so I tried to understand what was happening with that file.

Oh well, the file was pretty big: 600MB. I suspected the db2logmgr and infact it was accessing the file during the lock. once again, truss helped me. It was reading/seeking into the file... when it finished it did a semop on the same semaphore (an unlock obviously) and the db2loggw started to write again and everything was unblocked and working correctly.

I removed the db2rhist.asc and the db2rhist.bak and reduced the REC_HIS_RETENTN variable to 7 days (we keep backup for less than a week) while the default is 366...

Anyway I have some questions for any DB2 folk around here. First of all anything happened just when there was an asnapply or asncap committing transaction on the database (the replication processes). Second is that I do not understand while the db2logmgr must read so many data from the db2rhist.asc: is it responsible of the history rotation process or whatever?

Thanks a lot to everybody and bye.

Matteo Gelosa
I.NET S.p.A.



어쩄든 History FIle 에 액세스하는 동안 Commit이 느려진다는 점은 IBM문서에서 나와있으나, 어느정도 크기 이상이 되었을 때 성능에 문제가 있다던지 하는 기준은 없다. 가능하면 작게 유지하는 것이 좋다는 게 공식적인 입장이다.


다만 최근에 출시된 10.1 버전 Fixpack 2 부터는 History FIle이 커져도 성능에 영향을 주지 않도록 변경된 것으로 보인다.



http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.wn.doc/doc/c0060645.html

http://veryuglyboxer.blogspot.kr/2010/01/history-file.html

http://www.dbforums.com/db2/1652165-applicaiton-staying-long-time-commit-active-state.html

http://database.ittoolbox.com/groups/technical-functional/db2-l/application-in-commit-active-status-4428765

http://dbaspot.com/ibm-db2/195591-commit-active-problem-solved-db2rhist-asc.html

http://www-01.ibm.com/support/docview.wss?uid=swg21575692




728x90
728x90



Problem(Abstract)

A very large db2rhist.asc file can cause performance degradation to DML that modifies tables.

Symptom

Poor performance during inserts, updates and deletes. The application snapshots will show applications in COMMIT-ACTIVE longer than it should


Cause

During a log archive, DB2 writes to the db2rhist.asc file. As the file gets larger, the writes takes longer than usual. During this time any inserts, updates and deletes attempting to COMMIT during a log switch have to wait for the db2rhist.asc update to complete. This can momentarily 'freeze' the system for a few seconds.

This problem usually affects customers who use flash copy to backup the database. As DB2 backup is not invoked directly, the db2rhist.asc does not get pruned despite the RETENTION setting.


Diagnosing the problem

Disable HADR for isolation, as it may cause COMMIT-ACTIVE as well

 

Resolving the problem

Reduce the size of the db2rhist.asc via the prune force command periodically


http://www-01.ibm.com/support/docview.wss?uid=swg21575692

http://dbaspot.com/ibm-db2/195591-commit-active-problem-solved-db2rhist-asc.html


728x90

+ Recent posts