RHEL-6513

TCP Queries hang forever when an upstream server is not reachable

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • rhel-8.7.0
    • dnsmasq
    • None
    • Important
    • rhel-sst-cs-net-perf-services
    • ssg_core_services
    • None
    • False

      Description of problem:

      A customer is using dnsmasq with 3 upstream servers.
      When one of them is not reachable, queries hang until they time out.
      This happens even though --all-servers is used, which, according to the man page, is supposed to send the query to all servers concurrently:
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
      --all-servers
      By default, when dnsmasq has more than one upstream server available, it will send queries to just
      one server. Setting this flag forces dnsmasq to send all queries to all available servers. The reply
      from the server which answers first will be returned to the original requester.
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
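
To make the expectation concrete, here is a minimal sketch of the "first reply wins" behaviour that the man page text describes. It is not dnsmasq's implementation; `first_reply`, `query` and `fake_query` are hypothetical names used only for illustration, and `fake_query` simulates an unreachable 1.2.3.4 with a short sleep.

```python
import concurrent.futures
import time


def first_reply(servers, query, timeout=5.0):
    """Send the query to every server at once; return (server, reply) of the first success.

    `query` is a hypothetical callable that takes a server address and returns
    a reply, or raises on failure. It stands in for a real upstream lookup.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(servers)))
    futures = {pool.submit(query, s): s for s in servers}
    try:
        for fut in concurrent.futures.as_completed(futures, timeout=timeout):
            if fut.exception() is None:
                return futures[fut], fut.result()
        return None  # every server failed
    except concurrent.futures.TimeoutError:
        return None  # no server answered within the deadline
    finally:
        # Do not wait for slow or hung servers before returning.
        pool.shutdown(wait=False)


def fake_query(server):
    """Illustrative stand-in: 1.2.3.4 is slow and fails, any other server answers at once."""
    if server == "1.2.3.4":
        time.sleep(1.0)
        raise TimeoutError("unreachable")
    return "answer from " + server
```

With this model, `first_reply(["1.2.3.4", "192.168.122.1"], fake_query)` returns the answer from 192.168.122.1 without waiting for 1.2.3.4, regardless of the order in which the servers are listed.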

      Stracing dnsmasq shows that it indeed hangs in connect() until the daemon is killed:
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
      8731 14:27:08.458095 connect(13<TCP:[432416]>, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("1.2.3.4")}, 16 <unfinished ...>
      :
      8731 14:29:05.145298 <... connect resumed>) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) <116.687148>
      8731 14:29:05.145373 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
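
The trace shows a blocking connect() with no application-level deadline. As an illustration only (this is not a proposed dnsmasq patch, and `connect_with_deadline` is a hypothetical helper name), the sketch below shows how a TCP client can bound the connect attempt so that an unreachable address fails quickly instead of blocking for the kernel's full SYN retry period.

```python
import socket


def connect_with_deadline(host, port, timeout=2.0):
    """Return a connected TCP socket, or None if connect fails or exceeds `timeout` seconds."""
    try:
        # create_connection applies `timeout` to the connect() attempt itself,
        # so an unreachable address cannot block indefinitely.
        return socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return None
```

A caller would treat `None` as "this upstream is unavailable" and move on to the next server. Note that the returned socket keeps the same timeout for later send and receive calls.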

      Version-Release number of selected component (if applicable):

      dnsmasq-2.79-24.el8.x86_64 (also seen on RHEL9 dnsmasq-2.85-5.el9.x86_64)

      How reproducible:

      Always

      Steps to Reproduce:
      1. Set up dnsmasq with upstream servers 192.168.122.1 (my VM gateway) and 1.2.3.4 (not reachable)

      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

      # dnsmasq -k --conf-file=/dev/null --port 2053 --server 192.168.122.1 --server 1.2.3.4 -i lo -z --all-servers --no-resolv --no-hosts
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

      2. Query using dig

      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

      # dig +tcp @localhost -p 2053 srv foo.bar

      ; <<>> DiG 9.11.36-RedHat-9.11.36-5.el8_7.2 <<>> +tcp @localhost -p 2053 srv foo.bar
      ; (2 servers found)
      ;; global options: +cmd
      ;; connection timed out; no servers could be reached
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
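
For anyone who wants to reproduce the query without dig, here is a rough sketch of how the same SRV question is framed for TCP: a standard DNS message preceded by a two-byte big-endian length field. `build_tcp_query` is a hypothetical helper written for this report, not part of any existing tool, and it handles only plain ASCII names.

```python
import struct


def build_tcp_query(name, qtype, txid=0):
    """Build a DNS query for `name` and prefix it with the 2-byte length used over TCP."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label preceded by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    message = header + question
    # Over TCP, every DNS message is preceded by its length in two bytes.
    return struct.pack("!H", len(message)) + message
```

Calling `build_tcp_query("foo.bar", 33)` yields the bytes for the SRV query used above (33 is the SRV record type). Sending them to port 2053 on a socket with a read timeout set makes the hang observable without dig's internal retries.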

      Actual results:

      Time out, no result

      Expected results:

      Some result

      Additional info:

      When the order of the --server options is reversed (--server 1.2.3.4 --server 192.168.122.1), the query is answered immediately. This "proves" that 192.168.122.1 is then queried first, and that nothing is queried in parallel.

      ss shows that both children query the same server, which is not reachable:
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

      # ss -anp | grep SYN
      tcp SYN-SENT 0 1 192.168.122.184:56355 1.2.3.4:53 users:(("dnsmasq",pid=9373,fd=13))
      tcp SYN-SENT 0 1 192.168.122.184:57173 1.2.3.4:53 users:(("dnsmasq",pid=9370,fd=13))
      -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

      There are two children because dig internally retries the query when it gets no result.

              pemensik@redhat.com Petr Mensik
              rhn-support-rmetrich Renaud Métrich
              rhel-cs-infra-services-qe