OCPBUGS-12994

TCP DNS Local Preference is not working for OpenShift SDN


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • None
    • 4.13, 4.12, 4.11
    • None
    • Critical
    • No
    • SDN Sprint 235
    • 1
    • Rejected
    • False

      This is a clone of issue OCPBUGS-9985. The following is the description of the original issue:

      Description of problem:

      DNS local endpoint preference is not working for TCP DNS requests on OpenShift SDN.
      
      Reference code: https://github.com/openshift/sdn/blob/b58a257b896d774e0a092612be250fb9414af5ca/vendor/k8s.io/kubernetes/pkg/proxy/iptables/proxier.go#L999-L1012
      
      This is where the DNS request is short-circuited to the local DNS endpoint, if one exists. This matters because DNS local preference protects against another outstanding bug, in which daemonset pods go stale for a few seconds upon node shutdown (see https://issues.redhat.com/browse/OCPNODE-549 for the graceful node shutdown fix). The missing TCP local preference appears to be contributing to DNS issues in our internal CI clusters: https://lookerstudio.google.com/reporting/3a9d4e62-620a-47b9-a724-a5ebefc06658/page/MQwFD?s=kPTlddLa2AQ shows a large number of "dns_tcp_lookup" failures, which I attribute to this bug.
      
      UDP DNS local preference works fine in OpenShift SDN, and both UDP and TCP local preference work fine in OVN. Only TCP DNS local preference is broken in OpenShift SDN.
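
The local preference described above can be pictured with a small shell sketch. It is an illustration only, not the proxier code: the function name `prefer_local_endpoints` and the `<ip> <node>` input format are hypothetical. Given the endpoints of a service and the name of the node the client runs on, it keeps only the endpoints on that node and falls back to the full list when none are local.

```shell
# Hypothetical sketch of "prefer the local endpoint if one exists".
# stdin: one endpoint per line as "<ip> <node>" (assumed format).
# $1:    name of the node where the DNS client runs.
prefer_local_endpoints() {
  awk -v node="$1" '
    # Remember every endpoint, and separately those on the local node.
    NF >= 2 {
      all[++total] = $1
      if ($2 == node) { loc[++nloc] = $1 }
    }
    END {
      if (nloc > 0) {
        # At least one local endpoint: use only those.
        for (i = 1; i <= nloc; i++) { print loc[i] }
      } else {
        # No local endpoint: fall back to every endpoint.
        for (i = 1; i <= total; i++) { print all[i] }
      }
    }'
}
```

With working local preference, a client on a node that hosts a DNS pod should only ever be routed to that pod, for UDP and TCP alike.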

      Version-Release number of selected component (if applicable):

      4.13, 4.12, 4.11

      How reproducible:

      100%

      Steps to Reproduce:

      1. oc debug -n openshift-dns
      2. dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      # Retry multiple times, and you should always get the same local DNS pod.
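
The manual retries above can be scripted. The following is a minimal sketch, meant to be pasted into the debug pod's shell; the function names (`sample_responders`, `tally_responders`, `local_preference_ok`) are hypothetical helpers, and only the `dig` invocation comes from the steps above.

```shell
# Run the TCP query from the steps above once; prints the answering
# DNS pod name in double quotes.
query_dns_pod() {
  dig +short +tcp +vc +noall +answer CH TXT hostname.bind
}

# Run the query $1 times (default 10), one answer per line.
sample_responders() {
  n="${1:-10}"
  i=0
  while [ "$i" -lt "$n" ]; do
    query_dns_pod
    i=$((i + 1))
  done
}

# stdin: one responder per line. Prints "<count> <name>" for each
# distinct responder, after stripping quotes and blank lines.
tally_responders() {
  tr -d '"' | sed '/^[[:space:]]*$/d' | sort | uniq -c | awk '{print $1, $2}'
}

# stdin: one responder per line. Succeeds only if exactly one
# distinct responder answered every query.
local_preference_ok() {
  [ "$(tally_responders | wc -l | tr -d ' ')" -eq 1 ]
}
```

For example, `sample_responders 20 | tally_responders` lists how often each DNS pod answered, and `sample_responders 20 | local_preference_ok && echo OK` prints `OK` only when a single pod answered all 20 queries.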

      Actual results:

      [gspence@gspence origin]$ oc debug -n openshift-dns
      Starting pod/image-debug ...
      Pod IP: 10.128.2.10
      If you don't see a command prompt, try pressing enter.
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-glgr8"
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-gzlhm"
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-dnbsp"
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-gzlhm"
      

      Expected results:

      [gspence@gspence origin]$ oc debug -n openshift-dns
      Starting pod/image-debug ...
      Pod IP: 10.128.2.10
      If you don't see a command prompt, try pressing enter.
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-glgr8"
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-glgr8"
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-glgr8"
      sh-4.4# dig +short +tcp +vc +noall +answer CH TXT hostname.bind
      "dns-default-glgr8" 

      Additional info:

      https://issues.redhat.com/browse/OCPBUGS-488 is the previous bug I opened for UDP DNS local preference not working.
      
      iptables-save output from a vanilla 4.13 cluster bot cluster (AWS, SDN): https://drive.google.com/file/d/1jY8_f64nDWi5SYT45lFMthE0vhioYIfe/view?usp=sharing
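
One way to compare the UDP and TCP rules in a dump like this is to filter it by service port name. The sketch below assumes kube-proxy-style rule comments of the form `openshift-dns/dns-default:<port-name>` and that the DNS service ports are named `dns` (UDP) and `dns-tcp` (TCP); neither is stated in this report, so verify both against the actual dump. The function name `dns_rules` is hypothetical.

```shell
# Print the rules from an iptables-save dump (read on stdin) whose comment
# names the given port of the openshift-dns/dns-default service.
# $1 = service port name, e.g. "dns" or "dns-tcp" (assumed names).
dns_rules() {
  # The trailing group keeps "dns" from also matching "dns-tcp".
  grep -E "openshift-dns/dns-default:$1([\" ]|\$)"
}
```

Running `dns_rules dns < dump.txt` and `dns_rules dns-tcp < dump.txt` (where `dump.txt` is a local copy of the dump) puts the two rule sets side by side, so any difference in how local endpoints are handled for UDP versus TCP is easier to spot.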

              mkennell@redhat.com Martin Kennelly
              openshift-crt-jira-prow OpenShift Prow Bot
              Mike Fiedler Mike Fiedler
              Marc Curry