OpenShift Bugs / OCPBUGS-50605

K-nmstate does not modify static DNS in nmconnection file (was: NMState taking too much time to reapply the NNCE after node reboot)

      Waiting for feedback re: solution proposed on 2025/07/30 10:18 AM

      Description of problem:

      Some background about the issue:

      Let's say a user modifies `/etc/resolv.conf` on the nodes to add a search domain `example.com` through the NMState operator, using the NNCP below:

      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: worker-dns-modify
      spec:
        nodeSelector:
          node-role.kubernetes.io/worker: ""
        desiredState:
          dns-resolver:
            config:
              search:
              - example.com 
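
      To confirm the rollout, the policy and its per-node enactments can be checked with standard oc commands (node names below are placeholders; NNCE objects are named <node-name>.<policy-name>):

      $ oc apply -f worker-dns-modify.yaml
      $ oc get nncp worker-dns-modify
      $ oc get nnce
      $ oc get nnce <node-name>.worker-dns-modify -o yaml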

      The NNCP applies successfully. However, whenever the node reboots, whether due to patching, an MCO rollout, or a manual reboot, the configuration applied through the NNCP is reverted by design. Once the node reaches the READY state, the NNCP and NNCE are re-applied on the node, which is expected. Many customers have observed that the NNCE sometimes takes around 5 minutes to apply after the node becomes READY, while at other times it applies within 2 minutes.
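
      The revert window can be observed directly on the host; a quick check, assuming oc debug access to the node:

      $ oc debug node/<node-name> -- chroot /host cat /etc/resolv.conf
      # Right after the reboot, before the NNCE re-applies, the search line
      # is missing example.com; once the NNCE succeeds, it reappears.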

      The problem we are facing:
      In the test cluster, while reproducing the issue, pods start to schedule on the node as soon as it becomes READY. At that point, right after the reboot, NMState is still applying the NNCP and NNCE in parallel. However, the node still has the old DNS settings in `/etc/resolv.conf`, which is why the pods pick up the old DNS configuration from the node.

      For instance, when we `oc rsh` into the twistlock pod, we can see that this pod, which was created after the reboot but before NMState applied its changes, has no `example.com` search domain:

      $ oc get po -o wide | grep -i <node-name>
      twistlock-defender-xx-xxxxx 1/1 Running 1 8d 10.xx.xx.xx <node-name> <none> <none>
      $ oc rsh twistlock-defender-xx-xxxxx
      sh-5.1# cat /etc/resolv.conf
      search twistlock-example.svc.cluster.local svc.cluster.local cluster.local
      nameserver 10.xx.xx.xx
      options ndots:5
      sh-5.1#
      exit

      After the NNCE finishes successfully and the pod is restarted, the pod picks up the updated DNS settings from the node's /etc/resolv.conf; the new search domain `example.com` is now visible:

      $ oc delete po twistlock-defender-xx-xxxxx
      pod "twistlock-defender-xx-xxxxx" deleted
      $ oc get po -o wide | grep -i <node-name>
      twistlock-defender-xx-xxxxx 1/1 Running 0 10s 10.xx.xx.xx <node-name> <none> <none>
      $ oc rsh twistlock-defender-xx-xxxxx
      sh-5.1# cat /etc/resolv.conf
      search twistlock-example.svc.cluster.local svc.cluster.local cluster.local example.com
      nameserver 10.xx.xx.xx
      options ndots:5
      sh-5.1#
       
      

      The challenge we are facing due to this issue:

      It has been observed that after every node reboot, the NNCE is re-applied and takes nearly 3-4 minutes on average to apply the desired DNS configuration to the node's /etc/resolv.conf.

      During this window, pods have already been scheduled and are running on the node, because the node is reporting READY.

      Meanwhile, NMState is applying the NNCE in parallel, and by the time it finishes (those 3-4 minutes), the pods have already picked up the old DNS configuration. It therefore becomes extra work for users to restart the pods after every NNCE application so that the pods get the new DNS configuration from the node.

      In one customer scenario, nearly 3000 pods were running on the node, so restarting all of them is a significant additional overhead. Consider the time budget for the whole process: the node takes nearly 10 minutes to report READY after a reboot, NMState takes an additional 3-4 minutes, and restarting all 3000 pods takes another 3-4 minutes. A single node reboot therefore takes around 18 minutes on average before the node is fully operational.

      If a single reboot of one node costs 18-20 minutes, consider how big the problem becomes on large clusters with 50-60 nodes. A simple patching via MCO rollout reboots the nodes one by one; after each node becomes READY, the NNCE is re-applied on it, and the user then has to restart the pods on that node as well. This process is not well optimized.

      For customers who do not use DHCP, and instead configure static DNS at the time a node is added to the cluster, and who want to add an additional search domain such as `example.com` to the node's /etc/resolv.conf through NMState, this problem is especially troublesome.

      A possible solution to this problem:
      The NNCE is re-applied after each reboot, and the challenge is that pods are scheduled before the NNCE finishes. One way to optimize this: after a reboot, the nmstate-handler DaemonSet pods could keep the node cordoned until the NNCE is re-applied; once the NNCE applies successfully, the node is uncordoned, pods are allowed to schedule, and they automatically pick up the new DNS configuration from the node. A sketch of the idea follows. This could be a better approach, or some code-level change along these lines; it is just a suggestion from my side.
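
      Purely as an illustration of the suggestion above (this is not an existing nmstate-handler feature), the gating logic could look roughly like the following, assuming something with cluster credentials runs it on node startup; NODE and POLICY are placeholders:

      #!/bin/bash
      # Hypothetical sketch: keep the node cordoned until its enactment succeeds.
      NODE="<node-name>"
      POLICY="worker-dns-modify"

      oc adm cordon "$NODE"
      # Wait until the NNCE for this node reports Available=True.
      until oc get nnce "${NODE}.${POLICY}" \
          -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' \
          | grep -q True; do
        sleep 10
      done
      oc adm uncordon "$NODE"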

      I had a detailed discussion on this on Slack (Kubernetes-nmstate channel): https://redhat-internal.slack.com/archives/CP7329Z5Z/p1739286929445249?thread_ts=1739268087.035259&cid=CP7329Z5Z

      The solution proposed there: the customer already has an nmconnection file with a static IP and static DNS on the nodes, and wants to add more DNS search domains to the node's /etc/resolv.conf, currently using NMState for that purpose.
      If the customer instead adds the extra DNS search domains directly to the nmconnection file, the setting persists across reboots, and pods scheduled right after a reboot pick up the updated DNS settings from the node. This should solve the problem.
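
      As a sketch of that approach, the search domain can be appended to the existing connection profile with nmcli (the connection name is a placeholder; this updates the keyfile under /etc/NetworkManager/system-connections/, so it survives reboots):

      $ nmcli connection modify "<connection-name>" +ipv4.dns-search example.com
      $ nmcli connection up "<connection-name>"
      # The keyfile then carries, under [ipv4]:
      #   dns-search=example.com;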

       

      Version-Release number of selected component (if applicable):

      Kubernetes NMState operator

      How reproducible:

          Always

      Steps to Reproduce:

          1. Apply an NNCP on a node to modify or add new search domains in the node's /etc/resolv.conf.
          2. Reboot the node and observe that the NNCE is re-applied on the node (see the commands after this list).
          3. Only after the NNCE is re-applied do the new DNS search domains become visible.
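
          One way to measure the reapply delay from step 2 (a suggestion, using standard watch output; the node name is a placeholder):

          $ oc get nnce --watch
          $ oc get node <node-name> \
              -o jsonpath='{.status.conditions[?(@.type=="Ready")].lastTransitionTime}'
          # Compare the node's READY transition time with the time the NNCE
          # transitions back to Available.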
          

      Actual results:

          The new DNS search domains become visible only after the NNCE is re-applied.

      Expected results:

          The nodes should have a persistent way to modify /etc/resolv.conf that survives reboots.

      Additional info:

          More details are covered in the Slack thread shared above.

              Assignee: Mat Kowalski (mkowalsk@redhat.com)
              Reporter: Mridul Markandey (rhn-support-mmarkand)
              QA Contact: Ross Brattain