OpenShift Bugs / OCPBUGS-74434

Kubelet configuration directory is not created on scaled out node

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • 4.22.0
    • 4.22.0
    • Node / Kubelet
    • Proposed

      Description of problem:

      After a node is scaled out, the kubelet drop-in configuration directory (/etc/openshift/kubelet.conf.d) is missing on that node, and kubelet fails to start. The setup is managed by GitOps ZTP.

      ssh core@192.168.112.19 "journalctl -u kubelet"
      Jan 19 19:28:41 appworker-1.blueprint-cwl.nokia-stamp705.bos2.lab kubenswrapper[16815]: E0119 19:28:41.976024   16815 run.go:72] "command failed" err="failed to merge kubelet configs: failed to walk through kubelet dropin directory \"/etc/openshift/kubelet.conf.d\": lstat /etc/openshift/kubelet.conf.d: no such file or directory"
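
The error above is kubelet failing to lstat its drop-in directory. A minimal shell sketch for checking the same condition on a node follows. The function name check_dropin_dir is hypothetical; the only value taken from this report is the default path /etc/openshift/kubelet.conf.d.

```shell
# Hypothetical helper (not part of any OpenShift tooling): report whether
# a kubelet drop-in directory exists.
# Prints "present: <dir>" and returns 0 if the path is a directory,
# prints "missing: <dir>" and returns 1 otherwise.
check_dropin_dir() {
    check_dir="${1:-/etc/openshift/kubelet.conf.d}"
    if [ -d "$check_dir" ]; then
        echo "present: $check_dir"
        return 0
    fi
    echo "missing: $check_dir"
    return 1
}
```

Run on the affected node with no argument, this would be expected to print the "missing" line and return a non-zero status, matching the lstat failure in the journal.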

      Version-Release number of selected component (if applicable):

      OCP: 4.22.0-ec.0
      ACM: 2.16.0-126
      MCE: 2.11.0-155

      How reproducible:

      Reproducible each time. I've reproduced this on worker, gateway, and storage nodes.

      Steps to Reproduce:
      Follow the ACM documentation for worker node scale-in and scale-out:
      https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.15/html-single/multicluster_engine_operator_with_red_hat_advanced_cluster_management/index#scale-add-annotation

      https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.15/html-single/multicluster_engine_operator_with_red_hat_advanced_cluster_management/index#scale-out-annotation

          1. Scale in the worker node by adding the annotation and pruneManifests to the Git repo.
          2. Verify from both the hub and the spoke cluster that the node is scaled in successfully.
          3. Re-add the worker node to the Git repo to trigger the scale-out.
          4. The worker node does not appear in "oc get nodes".

      The kubelet logs on the worker node show that the kubelet configuration directory was not created.

      Actual results:
      appworker-1 is the node being scaled in and out. It has a BareMetalHost on the spoke cluster but does not appear in the node list.

      [root@Nokia-Rack705-Jumphost ~]# oc get nodes
      NAME                                                STATUS   ROLES                              AGE    VERSION
      appworker-0.blueprint-cwl.nokia-stamp705.bos2.lab   Ready    appworker,appworker-mcp-a,worker   4d1h   v1.34.2
      appworker-2.blueprint-cwl.nokia-stamp705.bos2.lab   Ready    appworker,appworker-mcp-b,worker   4d1h   v1.34.2
      appworker-3.blueprint-cwl.nokia-stamp705.bos2.lab   Ready    appworker,appworker-mcp-b,worker   4d1h   v1.34.2
      gateway-0.blueprint-cwl.nokia-stamp705.bos2.lab     Ready    gateway,gateway-mcp-a,worker       4d1h   v1.34.2
      gateway-1.blueprint-cwl.nokia-stamp705.bos2.lab     Ready    gateway,gateway-mcp-a,worker       4d1h   v1.34.2
      master-0.blueprint-cwl.nokia-stamp705.bos2.lab      Ready    control-plane,master,monitor       4d1h   v1.34.2
      master-1.blueprint-cwl.nokia-stamp705.bos2.lab      Ready    control-plane,master,monitor       4d1h   v1.34.2
      master-2.blueprint-cwl.nokia-stamp705.bos2.lab      Ready    control-plane,master,monitor       4d1h   v1.34.2
      storage-0.blueprint-cwl.nokia-stamp705.bos2.lab     Ready    storage,worker                     4d1h   v1.34.2
      storage-1.blueprint-cwl.nokia-stamp705.bos2.lab     Ready    storage,worker                     4d1h   v1.34.2
      storage-2.blueprint-cwl.nokia-stamp705.bos2.lab     Ready    storage,worker                     4d1h   v1.34.2
      storage-3.blueprint-cwl.nokia-stamp705.bos2.lab     Ready    storage,worker                     4d1h   v1.34.2
      
      [root@Nokia-Rack705-Jumphost ~]# oc get bmh -A
      NAMESPACE               NAME                                                STATE       CONSUMER                                                          ONLINE   ERROR   AGE
      openshift-machine-api   appworker-0.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-zfv6r-worker-0-bkk2f                                true             4d1h
      openshift-machine-api   appworker-1.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-appworker-1.blueprint-cwl.nokia-stamp705.bos2.lab   true             22m
      openshift-machine-api   appworker-2.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-zfv6r-worker-0-d7lpq                                true             4d1h
      openshift-machine-api   appworker-3.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-zfv6r-worker-0-ds89x                                true             4d1h
      openshift-machine-api   gateway-0.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-k542z                                true             4d1h
      openshift-machine-api   gateway-1.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-l4sv2                                true             4d1h
      openshift-machine-api   master-0.blueprint-cwl.nokia-stamp705.bos2.lab      unmanaged   blueprint-cwl-zfv6r-master-0                                      true             4d1h
      openshift-machine-api   master-1.blueprint-cwl.nokia-stamp705.bos2.lab      unmanaged   blueprint-cwl-zfv6r-master-1                                      true             4d1h
      openshift-machine-api   master-2.blueprint-cwl.nokia-stamp705.bos2.lab      unmanaged   blueprint-cwl-zfv6r-master-2                                      true             4d1h
      openshift-machine-api   storage-0.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-pqg4l                                true             4d1h
      openshift-machine-api   storage-1.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-r8qbx                                true             4d1h
      openshift-machine-api   storage-2.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-rrn69                                true             4d1h
      openshift-machine-api   storage-3.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-sd9f2                                true             4d1h
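
The two listings above can be compared mechanically to find hosts that have a BareMetalHost but never registered as a node. The following is a sketch only: hosts_missing_from_nodes is a hypothetical helper, and it assumes each input file holds one bare host name per line.

```shell
# Hypothetical helper: print names that appear in the BareMetalHost list
# but not in the node list.
#   $1 = file with node names, one per line
#   $2 = file with BareMetalHost names, one per line
hosts_missing_from_nodes() {
    nodes_file="$1"
    bmh_file="$2"
    sorted_nodes=$(mktemp)
    sorted_bmh=$(mktemp)
    # comm requires sorted input; sort -u also drops duplicate names.
    sort -u "$nodes_file" > "$sorted_nodes"
    sort -u "$bmh_file" > "$sorted_bmh"
    # comm -13 hides lines unique to the node list and lines common to both,
    # leaving only the names found in the BareMetalHost list alone.
    comm -13 "$sorted_nodes" "$sorted_bmh"
    rm -f "$sorted_nodes" "$sorted_bmh"
}
```

On the spoke cluster, the input files could be produced with "oc get nodes -o name" and "oc get bmh -n openshift-machine-api -o name", stripping the resource prefix from each line (for example with sed 's|.*/||'). With the output shown in this report, the only name printed would be the appworker-1 host.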

      Expected results:

      The node is scaled out successfully and appears in "oc get nodes".

      Additional info:
      Manually creating the kubelet configuration directory allows the node to become provisioned, but the node does not get the correct machineset configuration.

      I've reproduced the issue on appworker-1, gateway-1, and storage-0.

      ssh core@<node_ip> "sudo mkdir -p /etc/openshift/kubelet.conf.d && sudo systemctl restart kubelet"
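
The manual workaround above can also be written as an idempotent step, so that kubelet is restarted only when the directory actually had to be created. This is a sketch under assumptions: ensure_dropin_dir is a hypothetical name, and it does not address the machineset configuration problem described above.

```shell
# Hypothetical helper: create the kubelet drop-in directory only if absent.
# Prints "created" when the directory had to be made (a kubelet restart is
# then needed) and "exists" when nothing changed.
ensure_dropin_dir() {
    ensure_dir="${1:-/etc/openshift/kubelet.conf.d}"
    if [ -d "$ensure_dir" ]; then
        echo "exists"
    else
        mkdir -p "$ensure_dir" && echo "created"
    fi
}
```

On a node this would run as root, followed by "systemctl restart kubelet" only when the output is "created", mirroring the manual command in this report.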
      
      # from spoke
      [root@Nokia-Rack705-Jumphost ~]# oc get bmh -A
      NAMESPACE               NAME                                                STATE       CONSUMER                                                          ONLINE   ERROR   AGE
      openshift-machine-api   appworker-0.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-zfv6r-worker-0-bkk2f                                true             5d6h
      openshift-machine-api   appworker-1.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-appworker-1.blueprint-cwl.nokia-stamp705.bos2.lab   true             28h
      openshift-machine-api   appworker-2.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-zfv6r-worker-0-d7lpq                                true             5d6h
      openshift-machine-api   appworker-3.blueprint-cwl.nokia-stamp705.bos2.lab   unmanaged   blueprint-cwl-zfv6r-worker-0-ds89x                                true             5d6h
      openshift-machine-api   gateway-0.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-k542z                                true             5d6h
      openshift-machine-api   gateway-1.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-gateway-1.blueprint-cwl.nokia-stamp705.bos2.lab     true             3h47m
      openshift-machine-api   master-0.blueprint-cwl.nokia-stamp705.bos2.lab      unmanaged   blueprint-cwl-zfv6r-master-0                                      true             5d6h
      openshift-machine-api   master-1.blueprint-cwl.nokia-stamp705.bos2.lab      unmanaged   blueprint-cwl-zfv6r-master-1                                      true             5d6h
      openshift-machine-api   master-2.blueprint-cwl.nokia-stamp705.bos2.lab      unmanaged   blueprint-cwl-zfv6r-master-2                                      true             5d6h
      openshift-machine-api   storage-0.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-storage-0.blueprint-cwl.nokia-stamp705.bos2.lab     true             116m
      openshift-machine-api   storage-1.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-r8qbx                                true             5d6h
      openshift-machine-api   storage-2.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-rrn69                                true             5d6h
      openshift-machine-api   storage-3.blueprint-cwl.nokia-stamp705.bos2.lab     unmanaged   blueprint-cwl-zfv6r-worker-0-sd9f2                                true             5d6h
      
      # from hub
      [root@Nokia-Rack705-Jumphost ~]# oc get bmh -A
      NAMESPACE               NAME                                                STATE         CONSUMER                       ONLINE   ERROR   AGE
      blueprint-cwl           appworker-0.blueprint-cwl.nokia-stamp705.bos2.lab   provisioned                                  true             5d7h
      blueprint-cwl           appworker-1.blueprint-cwl.nokia-stamp705.bos2.lab   provisioned                                  true             30h
      blueprint-cwl           appworker-2.blueprint-cwl.nokia-stamp705.bos2.lab   provisioned                                  true             5d7h
      blueprint-cwl           appworker-3.blueprint-cwl.nokia-stamp705.bos2.lab   provisioned                                  true             5d7h
      blueprint-cwl           gateway-0.blueprint-cwl.nokia-stamp705.bos2.lab     provisioned                                  true             5d7h
      blueprint-cwl           gateway-1.blueprint-cwl.nokia-stamp705.bos2.lab     provisioned                                  true             5h24m
      blueprint-cwl           master-0.blueprint-cwl.nokia-stamp705.bos2.lab      provisioned                                  true             5d7h
      blueprint-cwl           master-1.blueprint-cwl.nokia-stamp705.bos2.lab      provisioned                                  true             5d7h
      blueprint-cwl           master-2.blueprint-cwl.nokia-stamp705.bos2.lab      provisioned                                  true             5d7h
      blueprint-cwl           storage-0.blueprint-cwl.nokia-stamp705.bos2.lab     provisioned                                  true             3h31m
      blueprint-cwl           storage-1.blueprint-cwl.nokia-stamp705.bos2.lab     provisioned                                  true             5d7h
      blueprint-cwl           storage-2.blueprint-cwl.nokia-stamp705.bos2.lab     provisioned                                  true             5d7h
      blueprint-cwl           storage-3.blueprint-cwl.nokia-stamp705.bos2.lab     provisioned                                  true             5d7h
      openshift-machine-api   master-0                                            unmanaged     hubcluster-hp-czvc5-master-0   true             6d22h
      openshift-machine-api   master-1                                            unmanaged     hubcluster-hp-czvc5-master-1   true             6d22h
      openshift-machine-api   master-2                                            unmanaged     hubcluster-hp-czvc5-master-2   true             6d22h

       

              rh-ee-ngopalak Neeraj Krishna Gopalakrishna
              rh-ee-ktsai Kevin Tsai
              Aruna Naik, Neelesh Agrawal, Neeraj Krishna Gopalakrishna, Xiaoli Tian
              Min Li Min Li