728x90

Environment

  • Red Hat Enterprise Linux release 8.7 (Ootpa)
  • 4.18.0-425.10.1.el8_7.x86_64
  • EXT4 filesystem without a journal.

Issue

  • IBM Informix Dynamic Server oninit process hanging, resulting in unresponsive/hung database connections due to deadlock on ext4 filesystem mounted without a journal.
  • IBM Informix Dynamic Server oninit process blocked with the following call traces.
[428617.156430] INFO: task oninit:4327 blocked for more than 120 seconds.
[428617.156431]       Not tainted 4.18.0-425.10.1.el8_7.x86_64 #1
[428617.156433] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[428617.156433] task:oninit          state:D stack:    0 pid: 4327 ppid:  4300 flags:0x00004084
[428617.156436] Call Trace:
[428617.156437]  __schedule+0x2d1/0x860
[428617.156439]  ? blk_flush_plug_list+0xd7/0x100
[428617.156441]  schedule+0x35/0xa0
[428617.156443]  ext4_unwritten_wait+0x87/0xc0 [ext4]
[428617.156460]  ? finish_wait+0x80/0x80
[428617.156462]  ext4_file_write_iter+0x396/0x3e0 [ext4]
[428617.156478]  aio_write+0xf6/0x1c0
[428617.156480]  ? blk_mq_sched_insert_requests+0x6c/0xf0
[428617.156483]  ? io_submit_one+0x7e/0x3c0
[428617.156484]  ? kmem_cache_alloc+0x13f/0x280
[428617.156486]  io_submit_one+0x131/0x3c0
[428617.156488]  __x64_sys_io_submit+0xa2/0x180
[428617.156489]  ? syscall_trace_enter+0x1ff/0x2d0
[428617.156492]  do_syscall_64+0x5b/0x1b0
[428617.156494]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[428617.156497] RIP: 0033:0x7fd1d50409bd
[428617.156499] Code: Unable to access opcode bytes at RIP 0x7fd1d5040993.
[428617.156500] RSP: 002b:00000000531c4da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
[428617.156501] RAX: ffffffffffffffda RBX: 00007fd1d698ab78 RCX: 00007fd1d50409bd
[428617.156502] RDX: 00000000531c4de8 RSI: 0000000000000001 RDI: 00007fd1c4593000
[428617.156503] RBP: 00007fd1c4593000 R08: 00000001358df800 R09: 000000005667b8d8
[428617.156504] R10: ffffffffffffffe0 R11: 0000000000000246 R12: 0000000000000001
[428617.156505] R13: 0000000000000000 R14: 00000000531c4de8 R15: 000000005667b850
[..]
[428617.158050] Kernel panic - not syncing: hung_task: blocked tasks
[428617.158052] CPU: 7 PID: 126 Comm: khungtaskd Kdump: loaded Not tainted 4.18.0-425.10.1.el8_7.x86_64 #1
[428617.158053] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.18227214.B64.2106252220 06/25/2021
[428617.158054] Call Trace:
[428617.158055]  dump_stack+0x41/0x60
[428617.158058]  panic+0xe7/0x2ac
[428617.158062]  watchdog+0x242/0x2f0
[428617.158065]  ? hungtask_pm_notify+0x50/0x50
[428617.158068]  kthread+0x10b/0x130
[428617.158070]  ? set_kthread_struct+0x50/0x50
[428617.158073]  ret_from_fork+0x1f/0x40

Resolution

  • This issue is currently being tracked in Red Hat private Bug ID #2179063
    [Private] Bug 2179063 - Database process hanging due to deadlock on ext4 filesystem mounted without a journal.

Root Cause

  • The symptoms are indicative of a known issue that has been addressed by upstream commit: 3eaf9cc62f447
$ git show 3eaf9cc62f447
commit 3eaf9cc62f447a742b26fa601993e94406aa1ea1
Author: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Date:   Tue Nov 5 23:02:23 2019 +1100

    ext4: update ext4_sync_file() to not use __generic_file_fsync()

    When the filesystem is created without a journal, we eventually call
    into __generic_file_fsync() in order to write out all the modified
    in-core data to the permanent storage device. This function happens to
    try and obtain an inode_lock() while synchronizing the files buffer
    and it's associated metadata.

    Generally, this is fine, however it becomes a problem when there is
    higher level code that has already obtained an inode_lock() as this
    leads to a recursive lock situation. This case is especially true when
    porting across direct I/O to iomap infrastructure as we obtain an
    inode_lock() early on in the I/O within ext4_dio_write_iter() and hold
    it until the I/O has been completed. Consequently, to not run into
    this specific issue, we move away from calling into
    __generic_file_fsync() and perform the necessary synchronization tasks
    within ext4_sync_file().

    Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
    Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/3495f35ef67f2021b567e28e6f59222e583689b8.1572949325.git.mbobrowski@mbobrowski.org
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index 5508baa11bb6..e10206e7f4bb 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -80,6 +80,43 @@ static int ext4_sync_parent(struct inode *inode)
        return ret;
 }

+static int ext4_fsync_nojournal(struct inode *inode, bool datasync,
+                               bool *needs_barrier)
+{
+       int ret, err;
+
+       ret = sync_mapping_buffers(inode->i_mapping);
+       if (!(inode->i_state & I_DIRTY_ALL))
+               return ret;
+       if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
+               return ret;
+
+       err = sync_inode_metadata(inode, 1);
+       if (!ret)
+               ret = err;
+
+       if (!ret)
+               ret = ext4_sync_parent(inode);
+       if (test_opt(inode->i_sb, BARRIER))
+               *needs_barrier = true;
+
+       return ret;
+}
+
+static int ext4_fsync_journal(struct inode *inode, bool datasync,
+                            bool *needs_barrier)
+{
+       struct ext4_inode_info *ei = EXT4_I(inode);
+       journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
+       tid_t commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
+
+       if (journal->j_flags & JBD2_BARRIER &&
+           !jbd2_trans_will_send_data_barrier(journal, commit_tid))
+               *needs_barrier = true;
+
+       return jbd2_complete_transaction(journal, commit_tid);
+}
+
 /*
  * akpm: A new design for ext4_sync_file().
  *
@@ -91,17 +128,14 @@ static int ext4_sync_parent(struct inode *inode)
  * What we do is just kick off a commit and wait on it.  This will snapshot the
  * inode to disk.
  */
-
 int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 {
-       struct inode *inode = file->f_mapping->host;
-       struct ext4_inode_info *ei = EXT4_I(inode);
-       journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
        int ret = 0, err;
-       tid_t commit_tid;
        bool needs_barrier = false;
+       struct inode *inode = file->f_mapping->host;
+       struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);

-       if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
+       if (unlikely(ext4_forced_shutdown(sbi)))
                return -EIO;

        J_ASSERT(ext4_journal_current_handle() == NULL);
@@ -111,23 +145,15 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
        if (sb_rdonly(inode->i_sb)) {
                /* Make sure that we read updated s_mount_flags value */
                smp_rmb();
-               if (EXT4_SB(inode->i_sb)->s_mount_flags & EXT4_MF_FS_ABORTED)
+               if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
                        ret = -EROFS;
                goto out;
        }

-       if (!journal) {
-               ret = __generic_file_fsync(file, start, end, datasync);
-               if (!ret)
-                       ret = ext4_sync_parent(inode);
-               if (test_opt(inode->i_sb, BARRIER))
-                       goto issue_flush;
-               goto out;
-       }
-
        ret = file_write_and_wait_range(file, start, end);
        if (ret)
                return ret;
+
        /*
         * data=writeback,ordered:
         *  The caller's filemap_fdatawrite()/wait will sync the data.
@@ -142,18 +168,14 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
         *  (they were dirtied by commit).  But that's OK - the blocks are
         *  safe in-journal, which is all fsync() needs to ensure.
         */
-       if (ext4_should_journal_data(inode)) {
+       if (!sbi->s_journal)
+               ret = ext4_fsync_nojournal(inode, datasync, &needs_barrier);
+       else if (ext4_should_journal_data(inode))
                ret = ext4_force_commit(inode->i_sb);
-               goto out;
-       }
+       else
+               ret = ext4_fsync_journal(inode, datasync, &needs_barrier);

-       commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
-       if (journal->j_flags & JBD2_BARRIER &&
-           !jbd2_trans_will_send_data_barrier(journal, commit_tid))
-               needs_barrier = true;
-       ret = jbd2_complete_transaction(journal, commit_tid);
        if (needs_barrier) {
-       issue_flush:
                err = blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL);
                if (!ret)
                        ret = err;
</tytso@mit.edu></jack@suse.cz></riteshh@linux.ibm.com></mbobrowski@mbobrowski.org></mbobrowski@mbobrowski.org>

Diagnostic Steps

  • A crash dump (VMcore) captured by enabling sysctl hung_task_panic indicates that there is a deadlock between the oninit task PID 4327 waiting for ext4_inode_info->i_unwritten to reach 0 and a rescuer worker thread dio/dm-8 PID 4585 waiting for the rw_semaphore lock held by the "oninit" task PID 4327 waiting for "ext4_inode_info->i_unwritten" to reach 0.

Kernel Crash Dump Analysis:

  • System Information:
crash> sys | grep -e RELEASE -e PANIC
     RELEASE: 4.18.0-425.10.1.el8_7.x86_64
       PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"

crash> sys -i | grep DMI_SYS_VENDOR: -A 1
         DMI_SYS_VENDOR: VMware, Inc.
       DMI_PRODUCT_NAME: VMware7,1
  • Kernel Ring Buffer:
crash> log
[..]
[428617.156430] INFO: task oninit:4327 blocked for more than 120 seconds.
[428617.156431]       Not tainted 4.18.0-425.10.1.el8_7.x86_64 #1
[428617.156433] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[428617.156433] task:oninit          state:D stack:    0 pid: 4327 ppid:  4300 flags:0x00004084
[428617.156436] Call Trace:
[428617.156437]  __schedule+0x2d1/0x860
[428617.156439]  ? blk_flush_plug_list+0xd7/0x100
[428617.156441]  schedule+0x35/0xa0
[428617.156443]  ext4_unwritten_wait+0x87/0xc0 [ext4]
[428617.156460]  ? finish_wait+0x80/0x80
[428617.156462]  ext4_file_write_iter+0x396/0x3e0 [ext4]
[428617.156478]  aio_write+0xf6/0x1c0
[428617.156480]  ? blk_mq_sched_insert_requests+0x6c/0xf0
[428617.156483]  ? io_submit_one+0x7e/0x3c0
[428617.156484]  ? kmem_cache_alloc+0x13f/0x280
[428617.156486]  io_submit_one+0x131/0x3c0
[428617.156488]  __x64_sys_io_submit+0xa2/0x180
[428617.156489]  ? syscall_trace_enter+0x1ff/0x2d0
[428617.156492]  do_syscall_64+0x5b/0x1b0
[428617.156494]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[428617.156497] RIP: 0033:0x7fd1d50409bd
[428617.156499] Code: Unable to access opcode bytes at RIP 0x7fd1d5040993.
[428617.156500] RSP: 002b:00000000531c4da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
[428617.156501] RAX: ffffffffffffffda RBX: 00007fd1d698ab78 RCX: 00007fd1d50409bd
[428617.156502] RDX: 00000000531c4de8 RSI: 0000000000000001 RDI: 00007fd1c4593000
[428617.156503] RBP: 00007fd1c4593000 R08: 00000001358df800 R09: 000000005667b8d8
[428617.156504] R10: ffffffffffffffe0 R11: 0000000000000246 R12: 0000000000000001
[428617.156505] R13: 0000000000000000 R14: 00000000531c4de8 R15: 000000005667b850
[..]
[428617.158050] Kernel panic - not syncing: hung_task: blocked tasks
[428617.158052] CPU: 7 PID: 126 Comm: khungtaskd Kdump: loaded Not tainted 4.18.0-425.10.1.el8_7.x86_64 #1
[428617.158053] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.18227214.B64.2106252220 06/25/2021
[428617.158054] Call Trace:
[428617.158055]  dump_stack+0x41/0x60
[428617.158058]  panic+0xe7/0x2ac
[428617.158062]  watchdog+0x242/0x2f0
[428617.158065]  ? hungtask_pm_notify+0x50/0x50
[428617.158068]  kthread+0x10b/0x130
[428617.158070]  ? set_kthread_struct+0x50/0x50
[428617.158073]  ret_from_fork+0x1f/0x40
[..]
  • The oninit task PID: 4327 has been in the uninterruptible (UN) state for 2 minutes and 35 seconds.
crash> set 4327
    PID: 4327
COMMAND: "oninit"
   TASK: ffff9fc84c518000  [THREAD_INFO: ffff9fc84c518000]
    CPU: 9
  STATE: TASK_UNINTERRUPTIBLE 

crash> px ((struct file *)((struct task_struct *)0xffff9fc84c518000)->mm->exe_file)->f_path.dentry
$4 = (struct dentry *) 0xffff9fc8fc257e00

crash> files -d 0xffff9fc8fc257e00
     DENTRY           INODE           SUPERBLK     TYPE PATH
ffff9fc8fc257e00 ffff9fc8fc178120 ffff9fc85014f000 REG  /users4/ifxdb_14.10.FC8_WE/bin/oninit

crash> ps -m 4327
[0 00:02:35.486] [UN]  PID: 4327     TASK: ffff9fc84c518000  CPU: 9    COMMAND: "oninit"
  • Backtrace of the oninit task PID: 4327
crash> bt
PID: 4327     TASK: ffff9fc84c518000  CPU: 9    COMMAND: "oninit"
 #0 [ffffb2fd13a77bf0] __schedule at ffffffffbb9eb121
 #1 [ffffb2fd13a77c80] schedule at ffffffffbb9eb6e5
 #2 [ffffb2fd13a77c90] ext4_unwritten_wait at ffffffffc08b7317 [ext4]
 #3 [ffffb2fd13a77cd8] ext4_file_write_iter at ffffffffc08b7b46 [ext4]
 #4 [ffffb2fd13a77d38] aio_write at ffffffffbb3b9e16
 #5 [ffffb2fd13a77e40] io_submit_one at ffffffffbb3ba1a1
 #6 [ffffb2fd13a77eb8] __x64_sys_io_submit at ffffffffbb3ba7b2
 #7 [ffffb2fd13a77f38] do_syscall_64 at ffffffffbb0043ab
 #8 [ffffb2fd13a77f50] entry_SYSCALL_64_after_hwframe at ffffffffbba000a9
    RIP: 00007fd1d50409bd  RSP: 00000000531c4da8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007fd1d698ab78  RCX: 00007fd1d50409bd
    RDX: 00000000531c4de8  RSI: 0000000000000001  RDI: 00007fd1c4593000
    RBP: 00007fd1c4593000   R8: 00000001358df800   R9: 000000005667b8d8
    R10: ffffffffffffffe0  R11: 0000000000000246  R12: 0000000000000001
    R13: 0000000000000000  R14: 00000000531c4de8  R15: 000000005667b850
    ORIG_RAX: 00000000000000d1  CS: 0033  SS: 002b
  • The oninit task PID: 4327 was waiting for an ext4_inode_info->i_unwritten to reach 0 for an ext4 inode:
crash> dis -rl ffffffffc08b7317 | tail -n 5
/usr/src/debug/kernel-4.18.0-425.10.1.el8_7/linux-4.18.0-425.10.1.el8_7.x86_64/fs/ext4/file.c: 111
0xffffffffc08b730b <ext4_unwritten_wait+123>:   call   0xffffffffbb1447b0 <init_wait_entry>
0xffffffffc08b7310 <ext4_unwritten_wait+128>:   jmp    0xffffffffc08b7317 <ext4_unwritten_wait+135>
0xffffffffc08b7312 <ext4_unwritten_wait+130>:   call   0xffffffffbb9eb6b0 <schedule>
0xffffffffc08b7317 <ext4_unwritten_wait+135>:   mov    $0x2,%edx

Kernel Source: fs/ext4/file.c
*****************************
218 static ssize_t
219 ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
220 {
221         struct inode *inode = file_inode(iocb->ki_filp);
222         int o_direct = iocb->ki_flags & IOCB_DIRECT;
223         int unaligned_aio = 0;
224         int overwrite = 0;
225         ssize_t ret;
...
273         /*
274          * Unaligned direct AIO must be the only IO in flight. Otherwise
275          * overlapping aligned IO after unaligned might result in data
276          * corruption.
277          */
278         if (ret == -EIOCBQUEUED && unaligned_aio)
279                 ext4_unwritten_wait(inode);
...

Kernel Source: fs/ext4/file.c
*****************************
107 static void ext4_unwritten_wait(struct inode *inode)
108 {
109         wait_queue_head_t *wq = ext4_ioend_wq(inode);
110 
111         wait_event(*wq, (atomic_read(&EXT4_I(inode)->i_unwritten) == 0));

crash> px ext4_unwritten_wait
ext4_unwritten_wait = $2 = 
 {void (struct inode *)} 0xffffffffc08b7290 <ext4_unwritten_wait>

crash> dis -r ffffffffc08b7b46 | tail -n 3
0xffffffffc08b7b3e <ext4_file_write_iter+910>:  mov    %rbp,%rdi
0xffffffffc08b7b41 <ext4_file_write_iter+913>:  call   0xffffffffc08b7290 <ext4_unwritten_wait>
0xffffffffc08b7b46 <ext4_file_write_iter+918>:  mov    %r14,%rdi

crash> bt -f | grep ext4_file_write_iter
 #3 [ffffb2fd13a77cd8] ext4_file_write_iter at ffffffffc08b7b46 [ext4]

crash> px (0xffffb2fd13a77cd8-0x8)
$5 = 0xffffb2fd13a77cd0

crash> rd 0xffffb2fd13a77cd0
ffffb2fd13a77cd0:  ffff9fca42897788                    .w.B....

o struct inode * == 0xffff9fca42897788

crash> struct ext4_inode_info.vfs_inode  -ox
struct ext4_inode_info {
   [0xe8] struct inode vfs_inode;
}

crash> px (0xffff9fca42897788-0xe8)
$6 = 0xffff9fca428976a0

o struct ext4_inode_info * == 0xffff9fca428976a0

crash> struct ext4_inode_info.i_unwritten 0xffff9fca428976a0
  i_unwritten = {
    counter = 1
  },

crash> struct inode.i_dentry 0xffff9fca42897788
    i_dentry = {
      first = 0xffff9fc85a8cc8f0
    },

crash> px (0xffff9fc85a8cc8f0-0xb0)
$7 = 0xffff9fc85a8cc840

o struct dentry * == 0xffff9fc85a8cc840

crash> files -d 0xffff9fc85a8cc840
     DENTRY           INODE           SUPERBLK     TYPE PATH
ffff9fc85a8cc840 ffff9fca42897788 ffff9fc84bf55800 REG  /informix/data/data3dbs_1
  • The ext4 file system mounted without journalling:
crash> mount | grep -e NAME -e ffff9fc84bf55800
     MOUNT           SUPERBLK     TYPE   DEVNAME   DIRNAME
ffff9fc87136d080 ffff9fc84bf55800 ext4   /dev/mapper/rhel-informix /informix  <<< - - -

crash> px ((struct ext4_sb_info *)((struct super_block *)0xffff9fc84bf55800)->s_fs_info)->s_journal
$8 = (struct journal_s *) 0x0

crash> struct inode.i_sb 0xffff9fca42897788
  i_sb = 0xffff9fc84bf55800,

crash> struct super_block.s_bdev 0xffff9fc84bf55800
  s_bdev = 0xffff9fc880c32d00,

crash> struct block_device.bd_disk 0xffff9fc880c32d00
  bd_disk = 0xffff9fc8706a9000,

crash> struct gendisk.disk_name 0xffff9fc8706a9000
  disk_name = "dm-8\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",

crash> dmshow --lvs | grep -e NAME -e dm-8
LV DM-X DEV   LV NAME                                   VG NAME                                  OPEN COUNT        LV SIZE (MiB)     PV NAME
dm-8          informix                                  rhel                                              1            512000.00     sda3

crash> dev -d | grep -e NAME -e dm-8
MAJOR GENDISK            NAME       REQUEST_QUEUE      TOTAL ASYNC  SYNC 
  253 ffff9fc8706a9000   dm-8       ffff9fc848c21fb0       0     0     0 
  • Let's verify the rescuer worker thread dio/dm-8 PID: 4585
crash> set 4585
    PID: 4585
COMMAND: "dio/dm-8"
   TASK: ffff9fca0d9ea800  [THREAD_INFO: ffff9fca0d9ea800]
    CPU: 4
  STATE: TASK_UNINTERRUPTIBLE 
  • It has been in the uninterruptible (UN) state for 2 minutes and 35 seconds.
crash> ps -m 4585
[0 00:02:35.436] [UN]  PID: 4585     TASK: ffff9fca0d9ea800  CPU: 4    COMMAND: "dio/dm-8"
  • Backtrace of the dio/dm-8 thread PID: 4585
crash> bt
PID: 4585     TASK: ffff9fca0d9ea800  CPU: 4    COMMAND: "dio/dm-8"
 #0 [ffffb2fd13bd7c98] __schedule at ffffffffbb9eb121
 #1 [ffffb2fd13bd7d28] schedule at ffffffffbb9eb6e5
 #2 [ffffb2fd13bd7d38] rwsem_down_write_slowpath at ffffffffbb15503c
 #3 [ffffb2fd13bd7dc0] __generic_file_fsync at ffffffffbb389f92
 #4 [ffffb2fd13bd7df0] ext4_sync_file at ffffffffc08b9310 [ext4]
 #5 [ffffb2fd13bd7e30] dio_complete at ffffffffbb3a5353
 #6 [ffffb2fd13bd7e68] process_one_work at ffffffffbb110767
 #7 [ffffb2fd13bd7ea8] rescuer_thread at ffffffffbb110b1b
 #8 [ffffb2fd13bd7f10] kthread at ffffffffbb117f8b
 #9 [ffffb2fd13bd7f50] ret_from_fork at ffffffffbba0024f
  • It was waiting for the rw_semaphore (inode->i_rwsem) lock.
crash> dis -rl ffffffffbb389f92 | tail -n 6
/usr/src/debug/kernel-4.18.0-425.10.1.el8_7/linux-4.18.0-425.10.1.el8_7.x86_64/./include/linux/fs.h: 787
0xffffffffbb389f83 <__generic_file_fsync+51>:   lea    0xa8(%rbp),%r14
0xffffffffbb389f8a <__generic_file_fsync+58>:   mov    %r14,%rdi
0xffffffffbb389f8d <__generic_file_fsync+61>:   call   0xffffffffbb9edd50 <down_write>
/usr/src/debug/kernel-4.18.0-425.10.1.el8_7/linux-4.18.0-425.10.1.el8_7.x86_64/fs/libfs.c: 1004
0xffffffffbb389f92 <__generic_file_fsync+66>:   mov    0x30(%rbp),%rdi

Kernel Source: fs/libfs.c
*************************
 992 int __generic_file_fsync(struct file *file, loff_t start, loff_t end,
 993                                  int datasync)
 994 {
 995         struct inode *inode = file->f_mapping->host;
 996         int err;
 997         int ret;
 998 
 999         err = file_write_and_wait_range(file, start, end);
1000         if (err)
1001                 return err;
1002 
1003         inode_lock(inode);

Kernel Source: include/linux/fs.h
*********************************
785 static inline void inode_lock(struct inode *inode)
786 {
787         down_write(&inode->i_rwsem);

crash> dis -r ffffffffc08b9310 | tail -n 3
0xffffffffc08b9308 <ext4_sync_file+456>:    mov    %rbx,%rdi
0xffffffffc08b930b <ext4_sync_file+459>:    call   0xffffffffbb389f50 <__generic_file_fsync>
0xffffffffc08b9310 <ext4_sync_file+464>:    mov    %eax,%r13d

crash> dis -r ffffffffbb389f92 | head -n 7
0xffffffffbb389f50 <__generic_file_fsync>:  nopl   0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffbb389f55 <__generic_file_fsync+5>:    push   %r14
0xffffffffbb389f57 <__generic_file_fsync+7>:    push   %r13
0xffffffffbb389f59 <__generic_file_fsync+9>:    push   %r12
0xffffffffbb389f5b <__generic_file_fsync+11>:   mov    %ecx,%r12d
0xffffffffbb389f5e <__generic_file_fsync+14>:   push   %rbp
0xffffffffbb389f5f <__generic_file_fsync+15>:   push   %rbx

crash> bt -f | grep __generic_file_fsync -A 4
 #3 [ffffb2fd13bd7dc0] __generic_file_fsync at ffffffffbb389f92
    ffffb2fd13bd7dc8: ffff9fc90b9dde00 ffff9fca42897788 
    ffffb2fd13bd7dd8: 0000000000000001 0000000000000800 
    ffffb2fd13bd7de8: 000000008b6517ff ffffffffc08b9310 
 #4 [ffffb2fd13bd7df0] ext4_sync_file at ffffffffc08b9310 [ext4]

o struct file * == 0xffff9fc90b9dde00

crash> px ((struct file *)0xffff9fc90b9dde00)->f_mapping.host
$9 = (struct inode *) 0xffff9fca42897788

o struct inode * == 0xffff9fca42897788

crash> struct inode.i_dentry 0xffff9fca42897788
    i_dentry = {
      first = 0xffff9fc85a8cc8f0
    },

crash> px (0xffff9fc85a8cc8f0-0xb0)
$10 = 0xffff9fc85a8cc840

o struct dentry * == 0xffff9fc85a8cc840

crash> files -d 0xffff9fc85a8cc840
     DENTRY           INODE           SUPERBLK     TYPE PATH
ffff9fc85a8cc840 ffff9fca42897788 ffff9fc84bf55800 REG  /informix/data/data3dbs_1

crash> struct inode.i_rwsem 0xffff9fca42897788 -ox
struct inode {
  [ffff9fca42897830] struct rw_semaphore i_rwsem;
}

o struct rw_semaphore * == 0xffff9fca42897830

crash> struct rw_semaphore.wait_list 0xffff9fca42897830 -ox
struct rw_semaphore {
  [ffff9fca42897838] struct list_head wait_list;
}
  • The waitlist of the rw_semaphore has 6 tasks.
crash> list -H 0xffff9fca42897838
ffffb2fd066abd90
ffffb2fd065dfd90
ffffb2fd194efd90
ffffb2fd17a47d90
ffffb2fd1b2dfd90
ffffb2fd13bd7d60
  • List of the task waiting for the rw_semaphore:
crash> list -H 0xffff9fca42897838 -s rwsem_waiter.task | grep task | tr -d , | awk '{print $3}' > rw_task

crash> ps -m < rw_task
[0 00:02:35.467] [UN]  PID: 3613198  TASK: ffff9fdee0a50000  CPU: 4    COMMAND: "kworker/4:5"
[0 00:02:35.467] [UN]  PID: 3613196  TASK: ffff9fe291838000  CPU: 4    COMMAND: "kworker/4:3"
[0 00:02:35.467] [UN]  PID: 3606261  TASK: ffff9fd99a8aa800  CPU: 4    COMMAND: "kworker/4:7"
[0 00:02:35.467] [UN]  PID: 3609395  TASK: ffff9fe68de85000  CPU: 4    COMMAND: "kworker/4:1"
[0 00:02:35.467] [UN]  PID: 3612070  TASK: ffff9fd2ae145000  CPU: 4    COMMAND: "kworker/4:0"
[0 00:02:35.436] [UN]  PID: 4585     TASK: ffff9fca0d9ea800  CPU: 4    COMMAND: "dio/dm-8"

crash> list -H 0xffff9fca42897838 -o rwsem_waiter.list -s rwsem_waiter.type | grep type | awk '{print $3}' | sort | uniq -c
      6 RWSEM_WAITING_FOR_WRITE,
  • The rw_semaphore (inode->i_rwsem) was held by the oninit task PID: 4327
crash> px ((struct rw_semaphore *)0xffff9fca42897830)->owner
$11 = {
  counter = 0xffff9fc84c518000
}

o struct task_struct * == 0xffff9fc84c518000

crash> task_struct.pid 0xffff9fc84c518000
  pid = 4327,

crash> ps 4327
      PID    PPID  CPU       TASK        ST  %MEM      VSZ      RSS  COMM
     4327    4300   9  ffff9fc84c518000  UN   0.0 29594148    25704  oninit

crash> set 4327
    PID: 4327
COMMAND: "oninit"
   TASK: ffff9fc84c518000  [THREAD_INFO: ffff9fc84c518000]
    CPU: 9
  STATE: TASK_UNINTERRUPTIBLE 
  • It has acquired the rw_semaphore in the ext4_file_write_iter() function.
crash> bt
PID: 4327     TASK: ffff9fc84c518000  CPU: 9    COMMAND: "oninit"
 #0 [ffffb2fd13a77bf0] __schedule at ffffffffbb9eb121
 #1 [ffffb2fd13a77c80] schedule at ffffffffbb9eb6e5
 #2 [ffffb2fd13a77c90] ext4_unwritten_wait at ffffffffc08b7317 [ext4]
 #3 [ffffb2fd13a77cd8] ext4_file_write_iter at ffffffffc08b7b46 [ext4]
 #4 [ffffb2fd13a77d38] aio_write at ffffffffbb3b9e16
 #5 [ffffb2fd13a77e40] io_submit_one at ffffffffbb3ba1a1
 #6 [ffffb2fd13a77eb8] __x64_sys_io_submit at ffffffffbb3ba7b2
 #7 [ffffb2fd13a77f38] do_syscall_64 at ffffffffbb0043ab
 #8 [ffffb2fd13a77f50] entry_SYSCALL_64_after_hwframe at ffffffffbba000a9
    RIP: 00007fd1d50409bd  RSP: 00000000531c4da8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007fd1d698ab78  RCX: 00007fd1d50409bd
    RDX: 00000000531c4de8  RSI: 0000000000000001  RDI: 00007fd1c4593000
    RBP: 00007fd1c4593000   R8: 00000001358df800   R9: 000000005667b8d8
    R10: ffffffffffffffe0  R11: 0000000000000246  R12: 0000000000000001
    R13: 0000000000000000  R14: 00000000531c4de8  R15: 000000005667b850
    ORIG_RAX: 00000000000000d1  CS: 0033  SS: 002b
  • The oninit task PID: 4327 and dio/dm-8 thread PID: 4585 were deadlocked.
 

https://access.redhat.com/solutions/7006119

 

 

728x90
728x90

Environment

  • SUSE Linux Enterprise Server 15 SP4
  • SUSE Linux Enterprise Server 15 SP4 for SAP

Situation

  • User is unable to access /dev/raw* devices after system is upgraded to SLES 15 SP4
  • The expected output(before upgrade):
server1:~ # systemctl status raw.service
● raw.service - raw devices
     Loaded: loaded (/usr/lib/systemd/system/raw.service; disabled; vendor preset: disabled)
     Active: active (exited) since Fri 2023-04-21 10:57:06 IST; 2min 18s ago
    Process: 1672 ExecStart=/bin/sh -c /sbin/modprobe raw; /sbin/udevadm settle;   for i in `grep -v ^# /etc/raw`; do rawdev=`echo $i | cut -f>
   Main PID: 1672 (code=exited, status=0/SUCCESS)

Apr 21 10:57:06 server1 systemd[1]: Starting raw devices...
Apr 21 10:57:06 server1 systemd[1]: Finished raw devices.


server1:~ # ls -l /dev/raw*
total 0
crw-rw---- 1 root disk 162, 0 Apr 21 10:57 rawctl
  • After upgrading to SLES15 SP4, raw devices are not accessible:
server1:~ # systemctl status raw.service
Unit raw.service could not be found.

server1:~ # ls -l /dev/raw*
ls: cannot access '/dev/raw*': No such file or directory

 

Resolution

  • From SLES15 SP4 onwards, applications should make use of open device files, such as /dev/sda1, with the O_DIRECT flag.

Cause

  • From SLES15 SP4 onwards, the raw utility has been deprecated
  • Following is the snippet from SLES15 SP4 release notes:
5.2 Basic utilities

5.2.1 util-linux has been updated 
The util-linux package has been updated to version 2.37.2.

The deprecated raw utility has been removed. Applications have to be ported to open(2) device files, such as /dev/sda1, with the O_DIRECT flag.

Additional Information

For more information about the change introduced in SLES15 SP4, please refer to the following: 
https://www.suse.com/releasenotes/x86_64/SUSE-SLES/15-SP4/index.html#basic-utility

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID:000021050
  • Creation Date:21-Apr-2023
  • Modified Date:21-Apr-2023
    • SUSE Linux Enterprise Server
    • SUSE Linux Enterprise Server for SAP Applications

 

https://www.suse.com/ko-kr/support/kb/doc/?id=000021050

728x90
728x90

Environment

  • Red Hat Enterprise Linux 7 (7.7 and newer)
  • kernel-3.10.0-1062.el7 and newer.
  • Oracle Redo Logs on XFS file systems.
  • Redo Log using direct AIO via io_submit().

Issue

  • Customers who use file systems for the Oracle Redo Log, and in this specific case XFS have seen performance regressions when moving from 7.2 to 7.8.
  • Very slow io_submit syscalls lead to poor Oracle DB performance when the Oracle log writer process uses XFS filesystems, without any visible delays in iostat on the disks used by the filesystem.

Resolution

  • Moving the Redo Log destination on to EXT4 is the easiest way to resolve this for now.
  • If XFS is required, then the Redo Log block size needs to be at least equal to the filesystem blocksize. This can mean either increasing the Redo Log block size to 4k , to match the default XFS blocksize, or rebuilding the XFS filesystem with a 512b block size.
    • For instructions on changing the Redo Log block size you will need to review the database manuals and consult the database vendor (Oracle).
    • Rebuilding the XFS filesystem with a 512b blocksize can be done with the mkfs.xfs options -b size=512 -m crc=0
  • The commit introducing the delays addressed possible corruption related issues. There is no planned change in the XFS code of Red Hat Enterprise Linux 7 to improve the performance of unaligned dio writes.

Root Cause

Based on the stack traces collected while the issue was appearing, the delay is caused because of waiting in inode_dio_wait caused by the submission of direct I/O write requests that were not aligned to filesystem blocks. In such a scenario, the following upstream patch, serializes requests (more thoroughly than before) in order to avoid data corruption.

This patch has been backported to Red Hat Enterprise Linux 7, in kernel-3.10.0-1062.el7, with commit:

ce7cfcb2740 [fs] xfs: serialize unaligned dio writes against all other dio writes

XFS applies more strict serialization constraints to unaligned
direct writes to accommodate things like direct I/O layer zeroing,
unwritten extent conversion, etc. Unaligned submissions acquire the
exclusive iolock and wait for in-flight dio to complete to ensure
multiple submissions do not race on the same block and cause data
corruption.

This generally works in the case of an aligned dio followed by an
unaligned dio, but the serialization is lost if I/Os occur in the
opposite order. If an unaligned write is submitted first and
immediately followed by an overlapping, aligned write, the latter
submits without the typical unaligned serialization barriers because
there is no indication of an unaligned dio still in-flight. This can
lead to unpredictable results.

To provide proper unaligned dio serialization, require that such
direct writes are always the only dio allowed in-flight at one time
for a particular inode. We already acquire the exclusive iolock and
drain pending dio before submitting the unaligned dio. Wait once
more after the dio submission to hold the iolock across the I/O and
prevent further submissions until the unaligned I/O completes. This
is heavy handed, but consistent with the current pre-submission
serialization for unaligned direct writes.

Diagnostic Steps

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

  • Database logs frequently report long delays from the Redo Log writer, similar to the following:
  • Warning: log write elapsed time 158ms, size 35041KB
    
  • Collecting strace from the Redo Log writer process reveals io_submit calls taking long to complete, matching the logs from the database. Strace options -yy -T -tt -f provide useful information on the files being accessed and on timing. Strace will need to be attached to the Redo Log writer process with the -p option.
    In the output of strace, offsets and bytes written in each io_submit call can be seen. These are not multiples of the XFS filesystem block size (writes are unaligned to the filesystem block size and because of commit ce7cfcb2740 serialized to avoid data corruption).
    A sample of such an io_submit call as captured by strace is the following:This sample is from an XFS filesystem with the default 4k blocksize. We can observe that the offsets are not multiples of 4096 - therefore IO is not aligned to filesystem blocks.
  • 347496 13:39:59.853377 io_submit(140055665958912, 66, [
        {data=..., pwrite, fildes=262, str="..."..., nbytes=1048576, offset=5254478848},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=285696, offset=5255527424},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=1048576, offset=5255813120},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=195072, offset=5256861696},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=1048576, offset=5257056768},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=843264, offset=5258105344},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=1048576, offset=5258948608},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=1048576, offset=5259997184},
        {data=..., pwrite, fildes=262, str="..."..., nbytes=407040, offset=5261045760},
        ...
    ...]) = 66 <0.120894>
    
  • Using the io_submit.stp systemtap script (attached at the bottom of the article), capture the slow io_submit and look for the dio_wait calls. In order to use the io_submit.stp script, the system needs to be prepared according to: What is SystemTap and how to use it?
    • Save the attached script as io_submit.stp. A copy of the script is also included in the systemtap-client package and can be found in /usr/share/systemtap/examples/io/io_submit.stp, once the package is installed. It can also be found in the upstream pages of systemtap
    • Compile and run the script using stap -g -d xfs -d kernel io_submit.stp
    • Run long enough to capture a few events.
    • Interrupt with CTRL+c. The script will only output data after stopping with CTRL+c.
    Example output from the io_submit.stp script. This output reveals that 218943+7630 io_submit calls have been waiting in inode_dio_wait called from xfs_file_dio_aio_write. This is the vast majority of calls. This code path matches the changes introduced by commit ce7cfcb2740 [fs] xfs: serialize unaligned dio writes against all other dio writes:
  • WARNING: Missing unwind data for a module, rerun with 'stap -d (unknown; retry with -DDEBUG_UNWIND)'
    WARNING: too many pending (warning) messages
    Ready!
    WARNING: Missing unwind data for a module, rerun with 'stap -d kernel'
    ^C218943: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffadc6ab49 : inode_dio_wait+0xd9/0x100 [kernel]
     0xffffffffc070c698 : xfs_file_dio_aio_write+0x2a8/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
     0x7fffffff7ef0 : 0x7fffffff7ef0
    7630: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffadc6ab49 : inode_dio_wait+0xd9/0x100 [kernel]
     0xffffffffc070c710 : xfs_file_dio_aio_write+0x320/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
     0x7fffffff7ef0 : 0x7fffffff7ef0
    511: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffae1874c5 : rwsem_down_write_failed+0x215/0x3c0 [kernel]
     0xffffffffadd96677 : call_rwsem_down_write_failed+0x17/0x30 [kernel]
     0xffff9bfd7d3bfd10 : 0xffff9bfd7d3bfd10
     0xffffffffc0719070 : xfs_ilock+0xa0/0x120 [xfs]
     0xffffffffc070c4a7 : xfs_file_dio_aio_write+0xb7/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
    110: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffadc6ab49 : inode_dio_wait+0xd9/0x100 [kernel]
     0xffffffffc070c698 : xfs_file_dio_aio_write+0x2a8/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
     0x7fffffff7790 : 0x7fffffff7790
    9: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffadc6ab49 : inode_dio_wait+0xd9/0x100 [kernel]
     0xffffffffc070c710 : xfs_file_dio_aio_write+0x320/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
     0x7fffffff7790 : 0x7fffffff7790
    5: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffae187775 : rwsem_down_read_failed+0x105/0x1c0 [kernel]
     0xffffffffadd96648 : call_rwsem_down_read_failed+0x18/0x30 [kernel]
     0xffff9bfd7d3bf9e0 : 0xffff9bfd7d3bf9e0
     0xffffffffc07190a9 : xfs_ilock+0xd9/0x120 [xfs]
     0xffffffffc0719120 : xfs_ilock_data_map_shared+0x30/0x40 [xfs]
     0xffffffffc06ff7ea : __xfs_get_blocks+0x9a/0x6d0 [xfs]
     0xffffffffc0700ff4 : xfs_get_blocks_direct+0x14/0x20 [xfs]
     0xffffffffadc917f6 : do_blockdev_direct_IO+0xf96/0x20a0 [kernel]
     0xffffffffadc92955 : __blockdev_direct_IO+0x55/0x60 [kernel]
     0xffffffffc070c570 : xfs_file_dio_aio_write+0x180/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
    2: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffadc6ab49 : inode_dio_wait+0xd9/0x100 [kernel]
     0xffffffffc070c698 : xfs_file_dio_aio_write+0x2a8/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
     0x7fffffff7da0 : 0x7fffffff7da0
    1: 0xffffffffae185d60 : schedule+0x0/0x70 [kernel]
     0xffffffffadc6ab49 : inode_dio_wait+0xd9/0x100 [kernel]
     0xffffffffc070c710 : xfs_file_dio_aio_write+0x320/0x390 [xfs]
     0xffffffffc070cb42 : xfs_file_aio_write+0x102/0x1b0 [xfs]
     0xffffffffadca4e13 : do_io_submit+0x3e3/0x8a0 [kernel]
     0xffffffffadca52e0 : sys_io_submit+0x10/0x20 [kernel]
     0xffffffffae18b459 : kretprobe_trampoline+0x0/0x57 [kernel]
     0x7fffffff7da0 : 0x7fffffff7da0
    

 

 

https://access.redhat.com/solutions/5906661

728x90
728x90

APAR status

  • OPEN

Error description

  • KAIO not supported with sector_sizes greater than 1kB
    

 

https://www.ibm.com/support/pages/apar/IT43456

728x90
728x90
 

Question

I found some indexes marked invalid after rollforward, or a HADR standby database become switched to primary.
Why the indexes are marked as invalid, and how I can recover the invalid indexes.

Cause

Db2 does not write index build or rebuild log records by default, rather Db2 marks the index as invalid so that recovery task can trigger to rebuild the index.

Answer

"Index rebuilding" means index creation, index reorg, classic table reorg, and rebuilding index in this article.
 
Indexes can be marked as invalid when:
  • Index rebuilding is not logged ( LOGINDEXBUILD=NO), then index rebuilding replayed.
  • Index rebuilding is logged ( LOGINDEXBUILD=YES), then index rebuilding replayed but INDEXREC is set as ACCESS_NO_REDO or RESTART_NO_REDO.
  • Failed offline reorg, or such failed offline reorg is replayed.
  • db2dart runs with /MI option.

Indexes can be rebuilt when:
  • There are one or more indexes marked invalid when a database restarts. i,e. Run RESTART DB with INDEXREC=RESTART
  • Someone access an index marked as invalid.
  • Run classic REORG TABLE command
  • Run REORG INDEXES or REORG INDEX command
  • Run CREATE INDEX statement
  • Run LOAD command, and it uses REBUILD option as INDEXING MODE
  • Run IMPORT or LOAD command with REPLACE option
  • Run TRAUNCATE TABLE statement

We can list all the indexes marked as invalid by the following query:
db2 "select tabschema, tabname, indname, datapartitionid, index_partitioning from table(sysproc.admin_get_index_info('','','')) as t where index_requires_rebuild='Y'"

We can recover the indexes marked as invalid by ether of the following ways.
 
a) Run RESTART DATABASE command (in case db cfg INDEXREC is set as RESTART)
The restart database command rebuild all the indexes set as INDEX_REQUIRES_REBUILD=Y
db2 restart database <database_name>
b) Run reorg indexes
db2 reorg indexes all on table <table_name> allow no access

Note: 
You can set LOGINDEXBUILD as YES to prevent indexes to be marked as invalid.
So, this setting is highly recommended in HADR databases.
Example:
db2 connect to <database_name>
db2 update db cfg using LOGINDEXBUILD YES

 

https://www.ibm.com/support/pages/db2-causes-mark-index-invalid-and-how-recover-indexes

 

[Db2] The causes to mark index invalid, and how to recover the indexes.

[Db2] The causes to mark index invalid, and how to recover the indexes.

www.ibm.com

 

728x90
728x90

Problem description:

Oracle 11g utilities like tnsping, lsnrctl or sqlplus on AIX 6.1 hanged for about 2,5 minutes (150 seconds) before connection with alias to any destination when DNS server was unreachable. In the same conditions (host and configuration) 10g utilities connected without delay. Oracle*Net was configured for local naming method only – NAMES.DIRECTORY_PATH=(tnsnames) in the both 10g and 11g Oracle Home. The tnsnames.ora files where identical in the 10g and 11g Oracle Home. The aliases in the tnsnames.ora used host names instead of IP addresses in the connection descriptions; if a host name was substituted by IP address the delay disappeared for this alias. Nevertheless all the host names from tnsnames where described in the /etc/hosts and resolution order was hosts = local, bind in the netsvc.conf (analog of  the nsswitch.conf in other UNIXs). This configuration should guarantee local name resolution when DNS server was unreachable. What is the root reason of this delay in 11g Oracle*Net ?

 

Diagnostic:

First of all I checked configuration of the Oracle*Net – sqlnet.ora, tnsnames.ora and AIX configuration of network name resolution  – /etc/resolv.conf, /etc/netsvc.conf. The configuration was typical.

 

The next steps were Oracle*Net and OS level tracing both 11g and 10g tnsping utility when DNS was reachable and unreachable. To emulate unreachable DNS server it’s IP address was changed in the /etc/resolv.conf

The Oracle*Net Support level tracing was enabled:

TNSPING.TRACE_LEVEL = 16
TRACE_LEVEL_CLIENT = 16

The comparison of 11g Oracle*Net traces gave no clue. Traces with and without reachable DNS server were almost identical. The comparison of 10g and 11g Oracle*Net traces did not help too. There were some differences but generally they were similar.

To trace tnsping at OS level were used AIX truss utility (truss -aefo tnsping.trc tnsping db_test). I checked the traces but din’t found any system call (like gethostbyname, gethostbyname2, gethostbyaddr or getaddrinfo) which could use DNS for the name resolution.

Comparing 11g with 10g trace files I found some new actions in the 11g trace. Here are some parts of tracing tnsping (11g and 10g) when DNS server was unreachable:

10g part:

kopen(“/etc/resolv.conf”, O_RDONLY)    = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(4, ” n a m e s e r v e r\t 1″.., 4096) = 38
kread(4, ” n a m e s e r v e r\t 1″.., 4096) = 0
statx(“/etc/resolv.conf”, 0x0FFFFFFFFFFFA2D8, 176, 0) = 0
close(4)                = 0
socket(1, 1, 0)            = 4
kfcntl(4, F_SETFD, 0x0000000000000001) = 0
connext(4, 0x0FFFFFFFFFFF9DA8, 1025)    Err#2  ENOENT
close(4)                = 0
kopen(“/etc/netsvc.conf”, O_RDONLY)    = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(4, ” #   @ ( # ) 4 3        “.., 4096) = 4096
kread(4, ” o n   a n d   r e s o l”.., 4096) = 663
close(4)                = 0
_thread_self()            = 33816773
__libc_sbrk(0x0000000000010020)    = 0x00000001109A2AA0
_thread_self()            = 33816773
kopen(“/etc/hesiod.conf”, O_RDONLY)    Err#2  ENOENT
kopen(“/etc/irs.conf”, O_RDONLY)    Err#2  ENOENT
_thread_self()            = 33816773
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 33816773
_thread_self()            = 33816773
_thread_self()            = 33816773
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 33816773
_thread_self()            = 33816773
__libc_sbrk(0x0000000000010020)    = 0x00000001109B2AC0
kopen(“/etc/hosts”, O_RDONLY)        = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kfcntl(4, F_SETFD, 0x0000000000000001) = 0
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(4, ” #   @ ( # ) 4 7\t 1 . 2″.., 4096) = 2029
kread(4, ” #   @ ( # ) 4 7\t 1 . 2″.., 4096) = 0
close(4)                = 0
socket(2, 1, 0)            = 4
getsockopt(4, 65535, 4104, 0x0FFFFFFFFFFFC124, 0x0FFFFFFFFFFFC120) = 0
connext(4, 0x00000001109B9E30, 16)    = 0
ngetsockname(4, 0x0FFFFFFFFFFFC1D4, 0x0FFFFFFFFFFFC1D0) = 0
getsockopt(4, 65535, 4097, 0x0FFFFFFFFFFFC324, 0x0FFFFFFFFFFFC320) = 0
getsockopt(4, 65535, 4098, 0x0FFFFFFFFFFFC324, 0x0FFFFFFFFFFFC320) = 0
setsockopt(4, 6, 1, 0x0FFFFFFFFFFFC32C, 4) = 0
kfcntl(4, F_SETFD, 0x0000000000000001) = 0
sigprocmask(2, 0x09001000A01BAAC0, 0x0FFFFFFFFFFFC330) = 0
_sigaction(13, 0x0FFFFFFFFFFFC460, 0x0FFFFFFFFFFFC4B8) = 0
thread_setmystate(0x0FFFFFFFFFFFBF20, 0x0000000000000000) = 0
kwrite(4, ” W0101 901 ,”.., 87) = 87
kread(4, ” A04 ” 5″.., 2064) = 65
close(4)                = 0
lseek(3, 27136, 0)            = 27136
kread(3, “\r\r ® V\r ¯”.., 512) = 512
kwrite(1, ” O K   ( 1 0   m s e c )”.., 13) = 13
kfcntl(1, F_GETFL, 0x0000000000000008) = 67110914
kfcntl(1, F_GETFL, 0x0000000000000008) = 67110914
close(3)                = 0
kfcntl(1, F_GETFL, 0x00000000120400C5) = 67110914
kfcntl(2, F_GETFL, 0x00000000120400C5) = 67110914
_exit(0)

11g part:

kopen(“/etc/resolv.conf”, O_RDONLY)    = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(4, ” n a m e s e r v e r\t 1″.., 4096) = 38
kread(4, ” n a m e s e r v e r\t 1″.., 4096) = 0
statx(“/etc/resolv.conf”, 0x0FFFFFFFFFFF7B08, 176, 0) = 0
close(4)                = 0
socket(1, 1, 0)            = 4
kfcntl(4, F_SETFD, 0x0000000000000001) = 0
connext(4, 0x0FFFFFFFFFFF75D8, 1025)    Err#2  ENOENT
close(4)                = 0
kopen(“/etc/netsvc.conf”, O_RDONLY)    = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(4, ” #   @ ( # ) 4 3        “.., 4096) = 4096
kread(4, ” o n   a n d   r e s o l”.., 4096) = 663
close(4)                = 0
_thread_self()            = 22020181
_thread_self()            = 22020181
kopen(“/etc/hesiod.conf”, O_RDONLY)    Err#2  ENOENT
kopen(“/etc/irs.conf”, O_RDONLY)    Err#2  ENOENT
_thread_self()            = 22020181
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 22020181
_thread_self()            = 22020181
_thread_self()            = 22020181
getdomainname(0x09001000A00DB290, 1024) = 0
_thread_self()            = 22020181
_thread_self()            = 22020181
__libc_sbrk(0x0000000000010020)    = 0x0000000110A699A0
kopen(“/etc/hosts”, O_RDONLY)        = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kfcntl(4, F_SETFD, 0x0000000000000001) = 0
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(4, ” #   @ ( # ) 4 7\t 1 . 2″.., 4096) = 2029
kread(4, ” #   @ ( # ) 4 7\t 1 . 2″.., 4096) = 0
close(4)                = 0
 kopen(“/etc/hosts”, O_RDONLY)        = 4
 kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
 kfcntl(4, F_SETFD, 0x0000000000000001) = 0
 kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
 kread(4, ” #   @ ( # ) 4 7\t 1 . 2″.., 4096) = 2029
 kread(4, ” #   @ ( # ) 4 7\t 1 . 2″.., 4096) = 0
 socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 35, 0, 0x0000000000000000) = 35
_poll(0x0FFFFFFFFFFF66B0, 1, 5000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 5000)    = 0
close(5)                = 0
 socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 35, 0, 0x0000000000000000) = 35
_poll(0x0FFFFFFFFFFF66B0, 1, 10000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 10000)    = 0
close(5)                = 0
socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 35, 0, 0x0000000000000000) = 35
_poll(0x0FFFFFFFFFFF66B0, 1, 20000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 20000)    = 0
close(5)                = 0
socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 35, 0, 0x0000000000000000) = 35
_poll(0x0FFFFFFFFFFF66B0, 1, 40000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 40000)    = 0
close(5)                = 0
 socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 27, 0, 0x0000000000000000) = 27
_poll(0x0FFFFFFFFFFF66B0, 1, 5000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 5000)    = 0
close(5)                = 0
socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 27, 0, 0x0000000000000000) = 27
_poll(0x0FFFFFFFFFFF66B0, 1, 10000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 10000)    = 0
close(5)                = 0
socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 27, 0, 0x0000000000000000) = 27
_poll(0x0FFFFFFFFFFF66B0, 1, 20000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 20000)    = 0
close(5)                = 0
socket(2, 2, 0)            = 5
getsockopt(5, 65535, 4104, 0x0FFFFFFFFFFF65E4, 0x0FFFFFFFFFFF65E0) = 0
connext(5, 0x09001000A0024658, 16)    = 0
_esend(5, 0x0FFFFFFFFFFF74F0, 27, 0, 0x0000000000000000) = 27
_poll(0x0FFFFFFFFFFF66B0, 1, 40000) (sleeping…)
_poll(0x0FFFFFFFFFFF66B0, 1, 40000)    = 0
 close(5)                = 0
close(4)                = 0
socket(2, 1, 0)            = 4
kfcntl(4, F_GETFL, 0x0000000000000000) = 2
kfcntl(4, F_SETFL, 0x0000000000000006) = 0
kioctl(4, -2147195266, 0x0FFFFFFFFFFF89C8, 0x0000000000000000) = 0
getsockopt(4, 65535, 4104, 0x0FFFFFFFFFFF9594, 0x0FFFFFFFFFFF9590) = 0
connext(4, 0x0000000110A76590, 16)    = 0
kfcntl(4, F_GETFL, 0x0000000000000000) = 6
kfcntl(4, F_SETFL, 0x0000000000000002) = 0
kioctl(4, -2147195266, 0x0FFFFFFFFFFF89C8, 0x0000000000000000) = 0
ngetsockname(4, 0x0FFFFFFFFFFF9FD8, 0x0FFFFFFFFFFF9680) = 0
getsockopt(4, 65535, 4097, 0x0FFFFFFFFFFFA634, 0x0FFFFFFFFFFFA630) = 0
getsockopt(4, 65535, 4098, 0x0FFFFFFFFFFFA634, 0x0FFFFFFFFFFFA630) = 0
setsockopt(4, 6, 1, 0x0FFFFFFFFFFFA63C, 4) = 0
kfcntl(4, F_SETFD, 0x0000000000000001) = 0
sigprocmask(2, 0x09001000A01BAAC0, 0x0FFFFFFFFFFFB0C0) = 0
_sigaction(13, 0x0FFFFFFFFFFFB1F0, 0x0FFFFFFFFFFFB248) = 0
thread_setmystate(0x0FFFFFFFFFFFACB0, 0x0000000000000000) = 0
__libc_sbrk(0x0000000000010020)    = 0x0000000110A799C0
kwrite(4, ” W0101 :01 ,”.., 87) = 87
kread(4, ” A04 ” 5″.., 8208) = 65
close(4)                = 0
lseek(3, 27136, 0)            = 27136
kread(3, “0E\t Ý \\t Þ”.., 512) = 512
kwrite(1, ” O K   ( 1 5 0 0 2 0   m”.., 17) = 17
kfcntl(1, F_GETFL, 0x0000000000000008) = 67110914
kfcntl(1, F_GETFL, 0x0000000000000008) = 67110914
close(3)                = 0
kfcntl(1, F_GETFL, 0x0000000011500055) = 67110914
kfcntl(2, F_GETFL, 0x0000000011500055) = 67110914
_exit(0)

We can see some new actions in the 11g part:

  • /etc/hosts look up were done 2 times; after the second lookup the file wasn’t close;
  • socket(2, 2, 0) was opened and closed 8 times (2 cycles by 4 times);
  • every time through this socket was sent a message (_esend) and status of the message queue was checked (_poll);
  • every time the checking of this message queue was waiting for response and finally timed out;
  • the waiting time was doubled (5000, 10000, 20000, 40000 ms) in the every next try in the cycle;
  • if we sum all the waiting time (5+10+20+40)*2 = 150 seconds !

The question is why all this new actions were done?

 

The reason

The answer is IPv6. It’s support was included in 11g Oracle*Net.

Even when the host name was resolved in IPv4 successfuly, the second look up in /etc/hosts is needed for IPv6 address host name resolution. If IPv6 address was not found in /etc/hosts then Oracle*Net requests DNS server for IPv6 name resolution because name resolution order is local, bind configured in the /etc/netsvc.conf.

 

Solution

To resolve the problem we should replace local, bind by local4, bind4  in the /etc/netsvc.conf. The meaning of the local4 and bind4 is explained in the comments in /etc/netsvc.conf:

# bind4       Uses BIND/DNS services for resolving only IPv4 addresses
# bind6       Uses BIND/DNS services for resolving only IPv6 addresses
# local4      Searches the local /etc/hosts file for resolving only IPv4 addresses
# local6      Searches the local /etc/hosts file for resolving only IPv6 addresses

 

https://odenysenko.wordpress.com/2012/02/11/investigation-of-the-11g-oraclenet-connection-delay-when-dns-server-is-unreachable/#more-482

 

728x90
728x90

Troubleshooting


Problem

Connections to Informix Dynamic Server (IDS) hang or take a long time to connect. Other users who are already connected do not see a performance problem. Once a user is connected there is no performance problem.

Symptom

Connections to Informix Dynamic Server (IDS) hang or take a long time to connect. Other users who are already connected do not see a performance problem. Once a user is connected there is no performance problem. Users connecting by way of shared memory do not experience a connection hang. Only users using network connections see this problem.

Cause


There are not enough files descriptors allocated for IDS to effectively handle the number of network connections. See the Related information section for a link to more information regarding file descriptors.

Note: This is only one possible cause.

Resolving The Problem


1. Increase the number of file descriptors available to a process. Ask your operating system administrator to increase the number of file descriptors .The suggested are values 2048, 4096, or unlimited. You can temporarily increase the number of file descriptors by running the ulimit -n command:


2. Restart IDS after the file descriptor increase is implemented.

Note: If this does not resolve the problem then look for the Collecting Data document that will help you collect the correct troubleshooting data for a hang. After you collect this data contact technical support.

Also look at this document; Slow connect time to Informix when using TCP/IP

 

https://www.ibm.com/support/pages/connections-ids-hang

 

Connections to IDS hang

 

www.ibm.com

 

728x90
728x90

Troubleshooting


Problem

Client applications using the TCP/IP protocol to connect to Informix experience poor performance during connection. Once the connection has been established the performance is improved.

Cause

  • During connection establishment, the IP address of the client is used by the IDS server to look up its host name. The method used to accomplish this is dependent on the configuration of the operating system. If the Domain Name System (DNS) is used, an operation known as a reverse query is used to map the IP address to host name. If the configuration of the DNS server is incomplete this type of query may take some time.

    The command line utility nslookup may be used to perform DNS queries. If a reverse lookup operation takes a long time to complete using nslookup then the IDS server will also take a long time to complete the same operation.

    The first example shows a reverse lookup operation for an IP address that is known to the DNS server and completes immediately. The second example uses an IP address that is unknown taking several seconds before returning an error.

    C:\>nslookup 1.2.3.4
    Server:  host1.domain.com
    Address:  9.8.7.6

    Name:    host222.domain.com
    Address:  1.2.3.4

    C:\>nslookup 9.69.1.126
    Server:  host1.domain.com
    Address:  9.8.7.6

    DNS request timed out.
    timeout was 2 seconds.
    *** Request to host1.domain.com timed-out


  • The lookup for the service name could be taking a while.
  • Another cause may by the use of IPv6 addresses with DNS servers that are not configured for IPv6.

Diagnosing The Problem

To isolate the problem to tcp/ip connections try to connect by way of a shared memory connection. If the connection is now completed quickly then the problem outlined above may be the cause of poor connection performance.

Follow this link if you need instructions for setting up shared memory connections:


A shared-memory connection (UNIX)

Resolving The Problem

  • Refer to your operating system documentation for details of how to configure DNS. If the DNS server runs on the Microsoft Windows® platform then it is possible that the zone used to implement reverse lookup is not implemented. Details regarding DNS on the Windows operating system can be found in the related URLs below. Ensure that the reverse-lookup zone is populated for all clients that connect to the IDS server.
  • Ensure that local hostname resolution is performed before other methods of hostname resolution (NIS, NIS+, DNS, and so on). Files like these are used on various operating systems to determine lookup order:

    /etc/netsrv.conf
    /etc/irs.conf
    /etc/nsswitch.conf
  • Reconfigure your operating system to not use DNS, for example by including the host names and IP addresses of all client machines in the local hosts file /etc/hosts or %WINDIR%\system32\drivers\etc\hosts on Windows.
  • Change the host name in the sqlhosts file to an IP address.
  • Change the service name in the sqlhosts file to the port number.
  • Connections to IDS hang

 

https://www.ibm.com/support/pages/slow-connect-time-informix-when-using-tcpip

  •  
 

Slow connect time to Informix when using TCP/IP

Slow connect time to Informix when using TCP/IP

www.ibm.com

 

728x90
728x90

APAR status

  • Closed as program error.

Error description

  • The listener thread may receive an OS error when trying to bind
    the port with an IPv6 address. This results in an AF similar to:
    
    Assert Warning: Unable to bind to the port (18105) on the host
    () for the server (server1).
    
    Stack for thread: 11 soctcplst
    
    (oninit) afstack
    (oninit) afhandler
    (oninit) afwarn_interface
    (oninit) listen_af_alarm
    (oninit) lisnsocket
    (oninit) tlListen
    (oninit) slSQIlisten
    (oninit) pfListen
    (oninit) cmListen
    (oninit) alListen
    (oninit) ASF_Call
    (oninit) sql_listener
    
    This may occur when using a specific IPv6 address other than the
    loopback address ::1.
    
    This problem may occur on releases of Linux where the option
    "myhostname" is permitted for the hosts entry in
    /etc/nsswitch.conf. The presence of this entry avoids the need
    to enter the host's specific IP address into /etc/hosts and will
    result in an IPv6 address being used by the listener thread if
    IPv6 has been configured for at least one network interface. Red
    Hat 7 is one such OS version where this option is available.
    

Local fix

  • The workaround is to allow the bind operation to use any of the
    available IPv6 addresses on the server by prefixing the hostname
    or address in column 3 of the sqlhosts file with an asterix "*".
    Alternatively if IPv6 is not required it may be disabled by
    setting the environment variable IFX_DISABLE_IPV6 or creating
    the empty file $INFORMIXDIR/etc/IFX_DISABLE_IPV6.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * informix user with IPv6 enabled                              *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Update to IBM Informix Server 12.10.xC6                      *
    ****************************************************************
    

Problem conclusion

  • Problem Fixed In IBM Informix Server 12.10.xC6
    

Temporary fix

  •  

Comments

  •  

APAR Information

  • APAR number
  • IT11530
  • Reported component name
  • INFORMIX SERVER
  • Reported component ID
  • 5725A3900
  • Reported release
  • C10
  • Status
  • CLOSED PER
  • PE
  • NoPE
  • HIPER
  • NoHIPER
  • Special Attention
  • NoSpecatt
  • Submitted date
  • 2015-10-01
  • Closed date
  • 2015-12-30
  • Last modified date
  • 2015-12-30
728x90
728x90

Troubleshooting


Problem

A problem with DNS lookup can cause slow connection to database server. This article explains one scenario and a possible solution for that problem.

Symptom

Connection request to database server from dbaccess is trying to access DNS server for hostname resolution, after it found the same in the local host file.

On certain Operating System (for example: Linux) you may noticed dbaccess (and other client applications) always doing DNS lookup while connecting to a database, even after found out the hostname or IP address in the local host file. This behavior sometimes caused slow connection, if you have problem related to DNS. Following is an excerpt of strace output shows the sequence of file accessed by a dbaccess request:

Cause

Traditionally, hostname and service name resolution were performed by functions such as gethostbyname(), getservbyname() etc. These traditional lookup functions are still available, however those are not forward compatible to IPv6. Instead, the IPv6 socket API provides new lookup functions that consolidate the functionality of several traditional functions. These new lookup functions are also backward compatible with IPv4, so a programmer can use the same translation algorithm in an application for both the IPv4 and Ipv6. The getaddrinfo() is the new primary lookup function and a connection request from the dbaccess ultimately calls this socket API. You can pass several parameters to the getaddrinfo(), one of those parameter is addrinfo structure. By default, dbaccess passes value “AF_INET6” for addrinfo.ai_family. The ai_family field indicates the protocol family associated with the request, and will be PF_INET6 for IPv6 or PF_INET for IPv4.

If the ai_family set to AF_INET6 (IPv6) the getaddrinfo() will search the DNS every time. If the ai_family set to AF_INET, then it don't query the DNS server. You can consult the 'man' page for getaddrinfo() for detailed information.

Beginning with Informix 10.00.xC4 and Client SDK 2.90.xC4, the database server checks, on startup, whether IPv6 is supported in the underlying operating system. If IPv6 is supported it is used. If the underlying operating system does not support IPv6, the IPv4 address is used.

Diagnosing The Problem

You can use the attached 'C' program to reproduce and verify the problem outside Informix.

/* 
 * gt_adr_info.c - An example of using getaddrinfo() function to validate IPV6.
 *                 compile: cc -o gt_adr_info  gt_adr_info.c 
 *                 usage: gt_adr_info <HostName>
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

int
lookup_host (const char *host)
{
  struct addrinfo hints, *res;
  int err;
  char addrstr[100];
  void *ptr;
  char *result;

  memset (&hints, 0, sizeof (hints));

  result = getenv("IFX_DISABLE_IPV6");

  if (result != NULL)
      hints.ai_family = AF_INET;
  else
      hints.ai_family = AF_INET6;

  hints.ai_flags = AI_PASSIVE;

  if ((err = getaddrinfo(host, NULL, &hints, &res)) != 0)
    {
      perror ("getaddrinfo");
      return -1;
    }

  printf ("HostName: %s\n", host);
  while (res)
    {
      inet_ntop (res->ai_family, res->ai_addr->sa_data, addrstr, 100);

      switch (res->ai_family)
        {
        case AF_INET:
          ptr = &((struct sockaddr_in *) res->ai_addr)->sin_addr;
          break;
        case AF_INET6:
          ptr = &((struct sockaddr_in6 *) res->ai_addr)->sin6_addr;
          break;
        }
      inet_ntop (res->ai_family, ptr, addrstr, 100);
      printf ("IPv%d address: %s (%s)\n", res->ai_family == PF_INET6 ? 6 : 4,
              addrstr, res->ai_canonname);
      res = res->ai_next;
    }

  return 0;
}

int
main (int argc, char *argv[])
{
  if (argc < 2)
    exit (1);
  return lookup_host (argv[1]);
}
gt_adr_info.c.txt

How to use the attached 'C' program?

1. Save the attached file as 'gt_adr_info.c'.

2, Compile the 'C' program using following command:


3. Run it as 'gt_adr_info <HostName>', where HostName is the machine name that you want to connect.

Resolving The Problem

In case of a problem with DNS lookup and encountering slow connection to database, you may use the environment variable IFX_DISABLE_IPV6 (IFX_DISABLE_IPV6=1) to disable Ipv6 and this will set the ai_family to AF_INET only and will not do subsequent query to the DNS server.

 

https://www.ibm.com/support/pages/possible-cause-slow-connection-database-server

 

A possible cause for slow connection to the database server

A possible cause for slow connection to the database server

www.ibm.com

 

728x90

+ Recent posts