Problem(Abstract)
Sometimes "ping timeout" and "send error" messages appear in online.log even though a check of the network environment shows nothing abnormal. In the end, the HDR relationship is broken. Why does this occur?
Symptom
ping timeout, received error, send error
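As a quick way to confirm these symptoms, the following minimal sketch counts how many lines of a log contain each of the message fragments listed above. It assumes nothing about the layout of online.log beyond plain text lines; the function name count_symptoms and the sample lines are hypothetical.

```python
# Message fragments taken from the Symptom line above.
SYMPTOMS = ("ping timeout", "received error", "send error")

def count_symptoms(lines):
    """Count the lines that contain each symptom string (case-insensitive)."""
    counts = {symptom: 0 for symptom in SYMPTOMS}
    for line in lines:
        lowered = line.lower()
        for symptom in SYMPTOMS:
            if symptom in lowered:
                counts[symptom] += 1
    return counts

# Hypothetical sample lines; real online.log entries may look different.
sample_lines = [
    "DR: ping timeout",
    "DR: send error",
    "Checkpoint completed",
]
sample_counts = count_symptoms(sample_lines)
```

To scan a real log, pass an open file object instead of the sample list, for example count_symptoms(open(path_to_online_log)).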
Cause
A ping timeout occurs if "DR_MSG_PING" cannot flow between the primary and the secondary, or if the acknowledgment takes longer than 4*DRTIMEOUT. The ping is a message type in the DR buffer queue, so it has to wait for DR buffer space. A ping timeout can therefore occur when the DR buffer is full or when the ping has a lower priority than the logical log buffer.
If the logical log buffer cannot be transferred, that can also lead to a "ping timeout".
Environment
HDR environment
Diagnosing the problem
The DR buffer is the same size as the logical log buffer, and "DR_MSG_PING" messages are held in the DR buffer, so the DR buffer size can be adjusted by configuring LOGBUFF.
The primary server sends logical log records to the secondary server to keep the data consistent, as follows:
1. primary: logical log buffer -> DR buffer
2. primary: the dr_prsend thread sends these logical log records to the DR buffer on the secondary server over the network using TCP/IP.
3. secondary: the dr_secrecv thread receives those logical log records on the secondary.
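The effect of a full DR buffer on the ping can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's actual implementation: the class DrBuffer, the message sizes, and the 64 KB capacity are all hypothetical. It shows a bounded queue shared by log records and ping messages, where the ping is rejected until the sender drains some log data.

```python
from collections import deque

class DrBuffer:
    """Toy bounded FIFO standing in for the DR buffer (sizes in KB, hypothetical)."""

    def __init__(self, capacity_kb):
        self.capacity_kb = capacity_kb
        self.used_kb = 0
        self.queue = deque()

    def try_enqueue(self, kind, size_kb):
        # A message is accepted only if it fits in the remaining space.
        if self.used_kb + size_kb > self.capacity_kb:
            return False
        self.queue.append((kind, size_kb))
        self.used_kb += size_kb
        return True

    def drain_one(self):
        # Models the sender shipping the oldest message to the secondary.
        if not self.queue:
            return None
        kind, size_kb = self.queue.popleft()
        self.used_kb -= size_kb
        return kind

buf = DrBuffer(capacity_kb=64)

# Fill the buffer with log data until nothing more fits.
while buf.try_enqueue("LOG", 16):
    pass

# With the buffer full, the ping has to wait.
ping_accepted_when_full = buf.try_enqueue("PING", 1)

# Once one log record has been sent, there is room for the ping.
drained_kind = buf.drain_one()
ping_accepted_after_drain = buf.try_enqueue("PING", 1)
```

In this model a larger capacity leaves more room for ping messages next to the log records, and a sender that stops draining keeps the ping waiting indefinitely.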
The HDR primary and secondary servers ping each other and must receive an acknowledgment within a set time; otherwise the HDR relationship is broken with a "ping timeout" error. The acknowledgment window is 4 times the DRTIMEOUT value.
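The acknowledgment window described above is simple arithmetic, sketched here for illustration. The function names and the DRTIMEOUT value of 30 seconds are assumptions chosen for the example, not recommended settings.

```python
def ping_timeout_window(drtimeout_seconds, multiplier=4):
    """Return the acknowledgment window: 4 times DRTIMEOUT, per the text above."""
    return multiplier * drtimeout_seconds

def ack_timed_out(elapsed_seconds, drtimeout_seconds):
    """True if an acknowledgment has taken longer than the window allows."""
    return elapsed_seconds > ping_timeout_window(drtimeout_seconds)

# Hypothetical example: DRTIMEOUT set to 30 seconds.
window = ping_timeout_window(30)      # 120 seconds
late = ack_timed_out(150, 30)         # acknowledgment arrived after the window
on_time = ack_timed_out(90, 30)       # acknowledgment arrived within the window
```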
Resolving the problem
To avoid "ping timeout" errors, consider the following:
1. Increase the LOGBUFF value to enlarge the DR buffer so that it can hold more signal messages.
2. Check whether the secondary server is hanging. A hang can lead to a "ping timeout" because the DR buffer cannot be received promptly or because DR_MSG_PING has a lower priority.
3. Check for long checkpoint durations on the secondary server.
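Before changing LOGBUFF, it can help to see the current LOGBUFF and DRTIMEOUT settings side by side. The sketch below assumes the usual onconfig layout of a parameter name followed by whitespace and a value, with "#" starting a comment. The function name read_onconfig_params and the sample values are hypothetical.

```python
def read_onconfig_params(text, names=("LOGBUFF", "DRTIMEOUT")):
    """Extract selected parameters from onconfig-style text (NAME value # comment)."""
    found = {}
    for raw in text.splitlines():
        # Drop anything after a comment marker, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in names:
            found[parts[0]] = parts[1].strip()
    return found

# Hypothetical excerpt; the values are examples, not recommendations.
sample = """
# HDR-related settings
LOGBUFF 64      # logical log buffer size in KB
DRTIMEOUT 30
DRINTERVAL 0
"""
params = read_onconfig_params(sample)
```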
http://www-01.ibm.com/support/docview.wss?uid=swg21413380