RHEL-100983

"host" command doesn't always return even though some DNS server provided an answer


    • No
    • Low
    • rhel-se-cs-infra-services
    • 3
    • False
    • False
    • None
    • None
    • None
    • None
    • Unspecified
    • Unspecified
    • Unspecified
    • x86_64
    • None

      What were you trying to do that didn't work?

      The command "host <some hostname>" frequently hangs and never returns an exit status. The configuration of /etc/resolv.conf is shown below.

       

      options timeout:1 attempts:1 rotate
      nameserver DNSSERVER1
      nameserver DNSSERVER2
      nameserver DNSSERVER3
      nameserver DNSSERVER4 
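As a rough frame of reference, the configuration above suggests how long a lookup should take at most before giving up. The sketch below parses resolv.conf text and multiplies servers by timeout by attempts. This is an assumption about the retry model (each listed server tried `attempts` times, each try waiting `timeout` seconds), not a description of the actual retry logic in host; the function names are hypothetical.

```python
def parse_resolv_conf(text):
    """Extract nameservers and the timeout/attempts options from resolv.conf text."""
    servers = []
    # Defaults documented in resolv.conf(5): timeout 5 seconds, attempts 2.
    opts = {"timeout": 5, "attempts": 2}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "nameserver" and len(fields) > 1:
            servers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                if name in opts and value.isdigit():
                    opts[name] = int(value)
    return servers, opts


def worst_case_seconds(text):
    """Upper bound under the assumed model: servers * timeout * attempts."""
    servers, opts = parse_resolv_conf(text)
    return len(servers) * opts["timeout"] * opts["attempts"]


# Same content as the /etc/resolv.conf quoted in this report.
CONF = """\
options timeout:1 attempts:1 rotate
nameserver DNSSERVER1
nameserver DNSSERVER2
nameserver DNSSERVER3
nameserver DNSSERVER4
"""

print(worst_case_seconds(CONF))
```

Under that assumption the bound is a few seconds, so a host process that is still running minutes later is hung rather than slowly retrying.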

       

      A coredump collected while host is hung shows the socket watcher thread (thread 4) blocked in epoll_wait(), while the remaining threads wait in pthread_cond_wait() or sigsuspend().

       

      Core was generated by `host'.
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      185    62:    movl    (%rsp), %edi
      (gdb) bt
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x00007f8cacb2635b in dispatch (manager=0x7f8cad9bc010, qid=qid@entry=isc_taskqueue_normal) at ../../../lib/isc/task.c:1095
      #2  0x00007f8cacb2710b in run_normal (uap=<optimized out>) at ../../../lib/isc/task.c:1325
      #3  0x00007f8cab6cbea5 in start_thread (arg=0x7f8ca975d700) at pthread_create.c:307
      #4  0x00007f8caa73eb0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      (gdb) info threads 
        Id   Target Id         Frame 
        5    Thread 0x7f8cad9f8840 (LWP 8957) 0x00007f8caa676702 in do_sigsuspend (set=0x7ffc698aaf20) at ../sysdeps/unix/sysv/linux/sigsuspend.c:31
        4    Thread 0x7f8ca7f5a700 (LWP 8963) 0x00007f8caa73f0e3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
        3    Thread 0x7f8ca875b700 (LWP 8962) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
        2    Thread 0x7f8ca8f5c700 (LWP 8961) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      * 1    Thread 0x7f8ca975d700 (LWP 8960) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      (gdb) t 4
      [Switching to thread 4 (Thread 0x7f8ca7f5a700 (LWP 8963))]
      #0  0x00007f8caa73f0e3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
      81    T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
      (gdb) bt
      #0  0x00007f8caa73f0e3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
      #1  0x00007f8cacb3b616 in watcher (uap=0x7f8cad9c0010) at ../../../../lib/isc/unix/socket.c:4258
      #2  0x00007f8cab6cbea5 in start_thread (arg=0x7f8ca7f5a700) at pthread_create.c:307
      #3  0x00007f8caa73eb0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
       

       

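      We also collected strace output, which shows the following. A 1-byte sendmsg() to 127.0.0.1 fails with EINVAL, then queries are sent to <IP1> and <IP2> within a few milliseconds of each other. One second later, matching the configured timeout, a query is sent to <IP3>. Replies are received from <IP1> and <IP3>, and the last epoll_wait() is still unfinished more than 90 seconds later: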

       

      5427  09:16:17.589619 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...>
      5424  09:16:17.598288 sendmsg(6<UDP:[127.0.0.1:35502]>, {msg_name={sa_family=AF_INET, sin_port=htons(35502), sin_addr=inet_addr("127.0.0.1")}, msg_namelen=16, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=17, cmsg_level=SOL_IP, cmsg_type=IP_TOS, cmsg_data=[0xb8]}], msg_controllen=24, msg_flags=0}, 0) = -1 EINVAL (Invalid argument) <0.000015>
      5424  09:16:17.605705 recvmsg(20<UDP:[0.0.0.0:42685]>, {msg_namelen=128}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000014>
      5424  09:16:17.606074 sendmsg(20<UDP:[0.0.0.0:42685]>, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("<IP1>")}, msg_namelen=16, msg_iov=[{iov_base="\20{\1\0\0\1\0\0\0\0\0\0\<hostname>\0\0\1\0\1", iov_len=28}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
      5427  09:16:17.606342 epoll_wait(5<anon_inode:[eventpoll]>, [{EPOLLIN, {u32=20, u64=20}}], 2048, -1) = 1 <0.000790>
      5427  09:16:17.607431 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...>
      5424  09:16:17.607458 recvmsg(20<UDP:[0.0.0.0:42685]>, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("<IP1>")}, msg_namelen=128->16, msg_iov=[{iov_base="\20{\201\202\0\1\0\0\0\0\0\0\<hostname>\0\0\1\0\1", iov_len=65535}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=SCM_TIMESTAMP, cmsg_data={tv_sec=1750986977, tv_usec=607082}}, {cmsg_len=17, cmsg_level=SOL_IP, cmsg_type=IP_TOS, cmsg_data=[0]}], msg_controllen=56, msg_flags=0}, 0) = 28 <0.000024>
      5424  09:16:17.610671 recvmsg(21<UDP:[0.0.0.0:57561]>, {msg_namelen=128}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000014>
      5424  09:16:17.611011 sendmsg(21<UDP:[0.0.0.0:57561]>, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("<IP2>")}, msg_namelen=16, msg_iov=[{iov_base="\20{\1\0\0\1\0\0\0\0\0\0\<hostname>\0\0\1\0\1", iov_len=28}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
      5427  09:16:17.611396 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...>
      5427  09:16:17.611961 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...>
      5424  09:16:18.611900 recvmsg(20<UDP:[0.0.0.0:56378]>, {msg_namelen=128}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000019>
      5424  09:16:18.612273 sendmsg(20<UDP:[0.0.0.0:56378]>, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("<IP3>")}, msg_namelen=16, msg_iov=[{iov_base="\20{\1\0\0\1\0\0\0\0\0\0\<hostname>\0\0\1\0\1", iov_len=28}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
      5427  09:16:18.612808 epoll_wait(5<anon_inode:[eventpoll]>, [{EPOLLIN, {u32=20, u64=20}}], 2048, -1) = 1 <0.000572>
      5427  09:16:18.613758 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...>
      5424  09:16:18.613792 recvmsg(20<UDP:[0.0.0.0:56378]>, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("<IP3>")}, msg_namelen=128->16, msg_iov=[{iov_base="\20{\201\202\0\1\0\0\0\0\0\0\<hostname>\0\0\1\0\1", iov_len=65535}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=SCM_TIMESTAMP, cmsg_data={tv_sec=1750986978, tv_usec=613339}}, {cmsg_len=17, cmsg_level=SOL_IP, cmsg_type=IP_TOS, cmsg_data=[0]}], msg_controllen=56, msg_flags=0}, 0) = 28 <0.000057>
      5427  09:16:18.614878 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...> <<<-------------- HERE
      5427  09:17:55.910254 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...> 
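For readers who want to interpret the payloads in the trace, the first 12 bytes of each message are the DNS header. The sketch below decodes them, with the bytes transcribed from the sendmsg() and recvmsg() lines above (strace prints them with octal escapes, so `\20{` is 0x10 0x7b and `\201\202` is 0x81 0x82). The function name is hypothetical, and only the header is decoded because the question section is redacted as `<hostname>`.

```python
import struct

# Response codes from RFC 1035, section 4.1.1.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def decode_dns_header(payload):
    """Decode the fixed 12-byte DNS header at the start of a message."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", payload[:12])
    rcode = flags & 0x000F
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),          # QR bit
        "recursion_desired": bool(flags & 0x0100),    # RD bit
        "recursion_available": bool(flags & 0x0080),  # RA bit
        "rcode": RCODES.get(rcode, str(rcode)),
        "qdcount": qdcount,
        "ancount": ancount,
    }


# Header bytes as they appear in the trace (query, then reply).
QUERY = b"\x10\x7b\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00"
REPLY = b"\x10\x7b\x81\x82\x00\x01\x00\x00\x00\x00\x00\x00"

print(decode_dns_header(QUERY))
print(decode_dns_header(REPLY))
```

Decoded this way, the replies from <IP1> and <IP3> both carry rcode SERVFAIL with zero answer records, so each server did respond but none returned a usable answer before the process stalled.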

       

      However, the host command never returns.

      It seems the cancellation code does not always work.

      What is the impact of this issue to you?

      Unable to use the host command.

      Please provide the package NVR for which the bug is seen:

      bind-utils-9.11.4-26.P2.el7_9.18.x86_64

      How reproducible is this bug?:

      Very easy.
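To put a number on how often the hang occurs, a small harness along these lines could be used. It is only a sketch: the helper names, the run count, the deadline, and the hostname in the usage note are all placeholders, not values taken from this report. It runs a command repeatedly and counts the runs that had to be killed because they exceeded a hard deadline.

```python
import subprocess


def run_with_deadline(cmd, deadline_s):
    """Run cmd; return its exit status, or None if it was killed at the deadline."""
    try:
        done = subprocess.run(cmd,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL,
                              timeout=deadline_s)
        return done.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so nothing is left behind.
        return None


def count_hangs(cmd, runs, deadline_s):
    """Count how many of `runs` invocations of cmd exceeded the deadline."""
    return sum(1 for _ in range(runs)
               if run_with_deadline(cmd, deadline_s) is None)
```

For example, `count_hangs(["host", "example.com"], runs=100, deadline_s=10)` would report how many of 100 lookups were still running after 10 seconds, which is well beyond the few seconds the resolv.conf settings in this report would suggest.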

      Steps to reproduce

      1. RHEL7.9 ELS

      Expected results

      host command should return.

      Actual results

      host command is stuck.

              rhn-support-bjmason Bryan Mason
              rhn-support-qguo Qi Guo
              Bryan Mason Bryan Mason
              se-cs-infra-services se-cs-infra-services se-cs-infra-services se-cs-infra-services