Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-31444

Wrong dnsPolicy is used for konnectivity-agent in data plane

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done-Errata
    • Icon: Normal Normal
    • 4.16.0
    • 4.14, 4.15
    • HyperShift
    • Moderate
    • No
    • False
    • Hide

      None

      Show
      None
    • Hide
      * Previously, the `konnectivity-agent` daemonset used the `ClusterIP` DNS policy. As a result, when CoreDNS was down, the `konnectivity-agent` pods on the data plane could not resolve the proxy-server-address, and they could fail to `konnectivity-server` in the control plane. With this update, the `konnectivity-agent` daemonset was modified to use `dnsPolicy: Default`. The `konnectivity-agent` uses the host system DNS service to look up the proxy server address, and it does not depend on CoreDNS anymore. (link:https://issues.redhat.com/browse/OCPBUGS-31444[*OCPBUGS-31444*])
      Show
      * Previously, the `konnectivity-agent` daemonset used the `ClusterIP` DNS policy. As a result, when CoreDNS was down, the `konnectivity-agent` pods on the data plane could not resolve the proxy-server-address, and they could fail to `konnectivity-server` in the control plane. With this update, the `konnectivity-agent` daemonset was modified to use `dnsPolicy: Default`. The `konnectivity-agent` uses the host system DNS service to look up the proxy server address, and it does not depend on CoreDNS anymore. (link: https://issues.redhat.com/browse/OCPBUGS-31444 [* OCPBUGS-31444 *])
    • Bug Fix
    • Done

      Description of problem:

      The konnectivity-agent on the data plane needs to resolve its proxy-server-url to connect the control plane's konnectivity server. Also, the these agents are using the default dnsPolicy which is ClusterFirst.
      
      This creates a dependency with CoreDNS. If CoreDNS is misconfigured or down, agents won't able to connect to the server, and all konnectivity related traffic goes down (blocks updates, webhooks, logs, etc).
      
      The correction would to use the dnsPolicy: Default in the konnectivity-agent daemonset on the data plane, so it would use the name resolution configuration from the node.
      
      This makes sure that the konnectivity-agent's proxy-server-url can be resolved even if coreDNS is down or mis-configured
      
      The konnectivity-agent control plane deployment shall not change as it still needs to use coreDNS as in that case a ClusterIP Service is configured as proxy-server-url.   

      Version-Release number of selected component (if applicable):

      4.14, 4.15
      
          

      How reproducible:

      Break coreDNS configuration

      Steps to Reproduce:

      1. Put an invalid forwarder to the dns.operator/default to fail upstream DNS resolving
      2. Rollout restart the konnectivity-agent daemonset in kube-system

      Actual results:

      kubectl log is failing

      Expected results:

      kubectl log is working

      Additional info:

       

            adam.mihelcsik Adam Mihelcsik
            adam.mihelcsik Adam Mihelcsik
            Jie Zhao Jie Zhao
            Laura Hinson Laura Hinson
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: