OpenShift Bugs: OCPBUGS-11834

4.13.0-rc.2: Adding IPv6 static IP worker node to SNO fails to become Ready (stuck NotReady)


    • Critical

      Description of problem:

      [k8s.ovn.org/node-chassis-id annotation not found for node "sno-3.kni-qe-35.lab.eng.rdu2.redhat.com", macAddress annotation not found for node "sno-3.kni-qe-35.lab.eng.rdu2.redhat.com", k8s.ovn.org/l3-gateway-config annotation not found for node "sno-3.kni-qe-35.lab.eng.rdu2.redhat.com"]

      Version-Release number of selected component (if applicable):

      4.13.0-rc.2

      How reproducible:

      Reproduced once so far; needs a retry to confirm frequency.

      Steps to Reproduce:

      1. Installed 4.13 RC2 using ABI
      2. Follow steps to add worker node manually
      https://docs.openshift.com/container-platform/4.11/nodes/nodes/nodes-sno-worker-nodes.html#sno-adding-worker-nodes-to-single-node-clusters-manually_add-workers
      (see also https://issues.redhat.com/browse/OCPBUGS-3053)
      
      Specifically-
      
      export OCP_VERSION=4.13.0-rc.2
      export ARCH=x86_64
      oc extract -n openshift-machine-api secret/worker-user-data --keys=userData --to=- > worker.ign
      sudo cp worker.ign /var/www/html/
      curl -k https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$OCP_VERSION/openshift-install-linux.tar.gz > openshift-install-linux.tar.gz
      tar zxvf openshift-install-linux.tar.gz
      chmod +x openshift-install
      ISO_URL=$(./openshift-install coreos print-stream-json | grep location | grep $ARCH | grep iso | cut -d\" -f4)
      curl -L $ISO_URL -o rhcos-live.iso
      
      sudo cp rhcos-live.iso /var/www/html/
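      The grep/cut pipeline above is fragile to formatting changes in the stream JSON. A minimal sketch of a more robust extraction, walking the CoreOS stream-metadata structure (`architectures.<arch>.artifacts.metal.formats.iso.disk.location`) instead of grepping quotes; the trimmed `stream_json` sample below is illustrative, not real cluster output, and assumes python3 is available:

      ```shell
      # Sample of the stream JSON shape, trimmed to the path we need.
      stream_json='{"architectures":{"x86_64":{"artifacts":{"metal":{"formats":{"iso":{"disk":{"location":"https://example.com/rhcos-live.x86_64.iso"}}}}}}}}'
      ARCH=x86_64
      # Walk the JSON structure; against a live workflow, pipe
      # `./openshift-install coreos print-stream-json` in instead.
      ISO_URL=$(printf '%s' "$stream_json" | python3 -c "
      import json, sys
      s = json.load(sys.stdin)
      print(s['architectures']['$ARCH']['artifacts']['metal']['formats']['iso']['disk']['location'])
      ")
      echo "$ISO_URL"
      ```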
      
      # use iDrac virtual media to attach virtual media and boot ISO
      curl -s http://10.1.101.1/rhcos-live.iso
      
      # boot ISO on node
      
      # Configure interface (ipv6)
      nmcli con mod Wired\ connection\ 2 ipv6.method manual ipv6.addresses 2620:52:0:165::42/124 ipv6.gateway 2620:52:0:165::4e ipv6.dns 2620:52:0:aa0::dead:beef
      # Bring interface up
      nmcli con up Wired\ connection\ 2
      
      # You might want to check and make sure you can ping in/out of server
      Checked!
      
      # Verify from this worker node's console that the webserver hosting the worker.ign file is reachable
      
      curl -s http://[2620:52:0:165::1]/worker.ign
      
      # Create a new ignition file, NEW-worker.ign, that references the original worker.ign and adds an instruction the coreos-installer program uses to populate the /etc/hostname file on the new worker host.
      
      cat /var/www/html/NEW-worker.ign
      {
          "ignition": {
              "version": "3.2.0",
              "config": {
                  "merge": [
                      {
                          "source": "http://[2620:52:0:165::1]/worker.ign"
                      }
                  ]
              }
          },
          "storage": {
              "files": [
                  {
                      "path": "/etc/hostname",
                      "contents": { "source": "data:,sno-3.kni-qe-35.lab.eng.rdu2.redhat.com" },
                      "mode": 420,
                      "overwrite": true
                  }
              ]
          }
      }
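      Before serving NEW-worker.ign it is worth confirming the file parses and carries the fields the install depends on. `ignition-validate` from the coreos/ignition project is the proper schema validator; the sketch below is only a minimal JSON sanity check (the /tmp path and the check itself are illustrative, not part of the documented procedure):

      ```shell
      # Write the config to a temp path and sanity-check the fields the
      # installer depends on. This is not a full Ignition schema validation.
      cat > /tmp/NEW-worker.ign <<'EOF'
      {
          "ignition": {
              "version": "3.2.0",
              "config": {
                  "merge": [
                      { "source": "http://[2620:52:0:165::1]/worker.ign" }
                  ]
              }
          },
          "storage": {
              "files": [
                  {
                      "path": "/etc/hostname",
                      "contents": { "source": "data:,sno-3.kni-qe-35.lab.eng.rdu2.redhat.com" },
                      "mode": 420,
                      "overwrite": true
                  }
              ]
          }
      }
      EOF
      python3 - <<'EOF'
      import json
      cfg = json.load(open("/tmp/NEW-worker.ign"))
      assert cfg["ignition"]["version"] == "3.2.0"
      assert cfg["ignition"]["config"]["merge"][0]["source"].endswith("/worker.ign")
      assert cfg["storage"]["files"][0]["path"] == "/etc/hostname"
      print("ignition config looks sane")
      EOF
      ```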
      
      
      # Run coreos-installer
      sudo coreos-installer install --copy-network --ignition-url http://[2620:52:0:165::1]/NEW-worker.ign /dev/sda --insecure-ignition
      
      
      
      # As the installation proceeds, the installation generates pending certificate signing requests (CSRs) for the worker node. When prompted, approve the pending CSRs to complete the installation.
      
      
      oc get csr
      NAME        AGE    SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
      csr-87dhw   107s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Pending
      
      oc adm certificate approve csr-87dhw
      certificatesigningrequest.certificates.k8s.io/csr-87dhw approved
      [kni@registry.kni-qe-31 kni-qe-35-ipv6.bak]$ oc get csr
      NAME        AGE     SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
      csr-87dhw   2m30s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
      
      oc get csr
      NAME        AGE     SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
      csr-87dhw   2m33s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
      csr-tgqgd   0s      kubernetes.io/kubelet-serving                 system:node:sno-3.kni-qe-35.lab.eng.rdu2.redhat.com                         <none>              Pending
      
      oc adm certificate approve csr-tgqgd
      certificatesigningrequest.certificates.k8s.io/csr-tgqgd approved
      [kni@registry.kni-qe-31 kni-qe-35-ipv6.bak]$ oc get csr
      NAME        AGE     SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
      csr-87dhw   8m8s    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
      csr-tgqgd   5m35s   kubernetes.io/kubelet-serving                 system:node:sno-3.kni-qe-35.lab.eng.rdu2.redhat.com                         <none>              Approved,Issued
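      The report approves each CSR by name as it appears. A hedged convenience for repeated runs: filter the Pending rows out of `oc get csr` tabular output and approve them in one pass (`pending_csrs` is a hypothetical helper name, not an oc feature):

      ```shell
      # Print the NAME column of every row whose CONDITION is Pending,
      # skipping the header line.
      pending_csrs() {
        awk 'NR > 1 && $NF == "Pending" {print $1}'
      }

      # Against a live cluster (assumes an authenticated oc session):
      #   oc get csr | pending_csrs | xargs -r oc adm certificate approve

      # Demonstration on output like that captured in this report:
      demo=$(printf '%s\n' \
        'NAME        AGE    SIGNERNAME                                    REQUESTOR    REQUESTEDDURATION   CONDITION' \
        'csr-87dhw   107s   kubernetes.io/kube-apiserver-client-kubelet   system:...   <none>              Pending' \
        'csr-tgqgd   5m35s  kubernetes.io/kubelet-serving                 system:...   <none>              Approved,Issued' \
        | pending_csrs)
      echo "$demo"
      ```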
      
      
       oc get nodes
      NAME                                      STATUS     ROLES                         AGE     VERSION
      sno-2                                     Ready      control-plane,master,worker   6h28m   v1.26.2+dc93b13
      sno-3.kni-qe-35.lab.eng.rdu2.redhat.com   NotReady   worker                        154m    v1.26.2+dc93b13

      Actual results:

      oc describe node sno-3.kni-qe-35.lab.eng.rdu2.redhat.com | grep -A5 Events:
      Events:
        Type     Reason                Age                  From             Message
        ----     ------                ----                 ----             -------
        Warning  ErrorReconcilingNode  14m                  controlplane     nodeAdd: error adding node "sno-3.kni-qe-35.lab.eng.rdu2.redhat.com": could not find "k8s.ovn.org/node-subnets" annotation
        Normal   RegisteredNode        14m                  node-controller  Node sno-3.kni-qe-35.lab.eng.rdu2.redhat.com event: Registered Node sno-3.kni-qe-35.lab.eng.rdu2.redhat.com in Controller
        Warning  ErrorReconcilingNode  3m5s (x26 over 14m)  controlplane     [k8s.ovn.org/node-chassis-id annotation not found for node sno-3.kni-qe-35.lab.eng.rdu2.redhat.com, macAddress annotation not found for node "sno-3.kni-qe-35.lab.eng.rdu2.redhat.com", k8s.ovn.org/l3-gateway-config annotation not found for node "sno-3.kni-qe-35.lab.eng.rdu2.redhat.com"]

      oc get pod -n openshift-multus
      NAME                                           READY   STATUS    RESTARTS   AGE
      multus-additional-cni-plugins-5s57s            0/1     Pending   0          20m
      multus-additional-cni-plugins-djhk9            1/1     Running   4          4h13m
      multus-admission-controller-6474b6c958-fb58z   2/2     Running   5          125m
      multus-fkzs6                                   1/1     Running   4          4h13m
      multus-gsqjv                                   0/1     Pending   0          20m
      network-metrics-daemon-h9rnm                   0/2     Pending   0          20m
      network-metrics-daemon-zg5rw                   2/2     Running   8          4h13m

      oc logs -n openshift-multus multus-fkzs6
      2023-04-14T14:44:14+00:00 [cnibincopy] Successfully copied files in /usr/src/multus-cni/rhel8/bin/ to /host/opt/cni/bin/upgrade_ce6f17b3-e924-4515-a5b4-f636c3f30506
      2023-04-14T14:44:14+00:00 [cnibincopy] Successfully moved files in /host/opt/cni/bin/upgrade_ce6f17b3-e924-4515-a5b4-f636c3f30506 to /host/opt/cni/bin/
      2023-04-14T14:44:14+00:00 WARN: {unknown parameter "-"}
      2023-04-14T14:44:14+00:00 Entrypoint skipped copying Multus binary.
      2023-04-14T14:44:14+00:00 Generating Multus configuration file using files in /host/var/run/multus/cni/net.d...
      2023-04-14T14:44:14+00:00 Attempting to find master plugin configuration, attempt 0
      2023-04-14T14:44:19+00:00 Attempting to find master plugin configuration, attempt 5
      2023-04-14T14:44:20+00:00 Using MASTER_PLUGIN: 10-ovn-kubernetes.conf
      2023-04-14T14:44:20+00:00 Nested capabilities string:
      2023-04-14T14:44:20+00:00 Using /host/var/run/multus/cni/net.d/10-ovn-kubernetes.conf as a source to generate the Multus configuration
      2023-04-14T14:44:20+00:00 Config file created @ /host/etc/cni/net.d/00-multus.conf
      { "cniVersion": "0.3.1", "name": "multus-cni-network", "type": "multus", "namespaceIsolation": true, "globalNamespaces": "default,openshift-multus,openshift-sriov-network-operator", "logLevel": "verbose", "binDir": "/opt/multus/bin", "readinessindicatorfile": "/var/run/multus/cni/net.d/10-ovn-kubernetes.conf", "kubeconfig": "/etc/kubernetes/cni/net.d/multus.d/multus.kubeconfig", "delegates": [ {"cniVersion":"0.4.0","name":"ovn-kubernetes","type":"ovn-k8s-cni-overlay","ipam":{},"dns":{},"logFile":"/var/log/ovn-kubernetes/ovn-k8s-cni-overlay.log","logLevel":"4","logfile-maxsize":100,"logfile-maxbackups":5,"logfile-maxage":5} ] }

      oc logs -n openshift-multus multus-gsqjv
      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s
      Defaulted container "kube-multus-additional-cni-plugins" out of: kube-multus-additional-cni-plugins, egress-router-binary-copy (init), cni-plugins (init), bond-cni-plugin (init), routeoverride-cni (init), whereabouts-cni-bincopy (init), whereabouts-cni (init)

      oc logs -n openshift-multus multus-additional-cni-plugins-djhk9
      Defaulted container "kube-multus-additional-cni-plugins" out of: kube-multus-additional-cni-plugins, egress-router-binary-copy (init), cni-plugins (init), bond-cni-plugin (init), routeoverride-cni (init), whereabouts-cni-bincopy (init), whereabouts-cni (init)

      # No output from the following per-container log queries:
      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s -c kube-multus-additional-cni-plugins
      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s -c cni-plugins
      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s -c bond-cni-plugin
      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s -c routeoverride-cni
      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s -c whereabouts-cni

      oc logs -n openshift-multus multus-additional-cni-plugins-5s57s -c reabouts-cni-bincopy
      error: container reabouts-cni-bincopy is not valid for pod multus-additional-cni-plugins-5s57s

      SNO-2 Annotations:
        k8s.ovn.org/host-addresses: ["2620:52:0:165::41"]
        k8s.ovn.org/l3-gateway-config: {"default":{"mode":"shared","interface-id":"br-ex_sno-2","mac-address":"10:70:fd:64:e8:0b","ip-addresses":["2620:52:0:165::41/124"],"ip-ad...
        k8s.ovn.org/node-chassis-id: bd8cfb0b-3a50-4daa-978f-95355356684a
        k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv6":"fd98::2/64"}
        k8s.ovn.org/node-mgmt-port-mac-address: be:16:46:cc:fd:1a
        k8s.ovn.org/node-primary-ifaddr: {"ipv6":"2620:52:0:165::41/124"}
        k8s.ovn.org/node-subnets: {"default":["fd01:0:0:1::/64"]}
        machineconfiguration.openshift.io/controlPlaneTopology: SingleReplica
        machineconfiguration.openshift.io/currentConfig: rendered-master-3dfb0e3ab27e395ce05390080ea36638
        machineconfiguration.openshift.io/desiredConfig: rendered-master-3dfb0e3ab27e395ce05390080ea36638
        machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-master-3dfb0e3ab27e395ce05390080ea36638
        machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-master-3dfb0e3ab27e395ce05390080ea36638
        machineconfiguration.openshift.io/lastSyncedControllerConfigResourceVersion: 13543
        machineconfiguration.openshift.io/reason:
        machineconfiguration.openshift.io/state: Done
        sriovnetwork.openshift.io/state: Idle
        volumes.kubernetes.io/controller-managed-attach-detach: true

      SNO-3 (worker node) Annotations:
        k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv6":"fd98::3/64"}
        k8s.ovn.org/node-subnets: {"default":["fd01:0:0:2::/64"]}
        machineconfiguration.openshift.io/controlPlaneTopology: SingleReplica
        volumes.kubernetes.io/controller-managed-attach-detach: true

      Expected results:

      Expect worker node to become Ready

      Additional info:

      must-gather
      http://10.1.101.1/4.13/must-gather/openshift_sno_ipv6_add_worker_not_ready_4.13.0-rc2.tar
      
      Reprinting Cluster State:
      When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
      ClusterID: 4b1fcfb6-8e09-4aaf-924a-b14037a2736a
      ClusterVersion: Stable at "4.13.0-rc.2"
      ClusterOperators:
      	clusteroperator/dns is progressing: DNS "default" reports Progressing=True: "Have 1 available node-resolver pods, want 2."
      	clusteroperator/image-registry is progressing: NodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Progressing: All registry resources are removed
      	clusteroperator/machine-config is not available (Cluster not available for [{operator 4.13.0-rc.2}]: failed to apply machine config daemon manifests: error during waitForDaemonsetRollout: [timed out waiting for the condition, daemonset machine-config-daemon is not ready. status: (desired: 2, updated: 2, ready: 1, unavailable: 1)]) because Failed to resync 4.13.0-rc.2 because: failed to apply machine config daemon manifests: error during waitForDaemonsetRollout: [timed out waiting for the condition, daemonset machine-config-daemon is not ready. status: (desired: 2, updated: 2, ready: 1, unavailable: 1)]
      	clusteroperator/network is progressing: DaemonSet "/openshift-multus/multus-additional-cni-plugins" is not available (awaiting 1 nodes)
      DaemonSet "/openshift-multus/multus" is not available (awaiting 1 nodes)
      DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" is not available (awaiting 1 nodes)
      DaemonSet "/openshift-multus/network-metrics-daemon" is not available (awaiting 1 nodes)
      	clusteroperator/node-tuning is progressing: Waiting for 1/2 Profiles to be applied
      
      
      
      oc get events -n openshift-multus
      LAST SEEN   TYPE      REASON             OBJECT                                    MESSAGE
      2m48s       Warning   FailedScheduling   pod/multus-additional-cni-plugins-5s57s   0/2 nodes are available: 1 Insufficient management.workload.openshift.io/cores. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 node(s) didn't match Pod's node affinity/selector..
      131m        Normal    SuccessfulCreate   daemonset/multus-additional-cni-plugins   Created pod: multus-additional-cni-plugins-5s57s
      2m48s       Warning   FailedScheduling   pod/multus-gsqjv                          0/2 nodes are available: 1 Insufficient management.workload.openshift.io/cores. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 node(s) didn't match Pod's node affinity/selector..
      131m        Normal    SuccessfulCreate   daemonset/multus                          Created pod: multus-gsqjv
      2m48s       Warning   FailedScheduling   pod/network-metrics-daemon-h9rnm          0/2 nodes are available: 1 Insufficient management.workload.openshift.io/cores. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 node(s) didn't match Pod's node affinity/selector..
      131m        Normal    SuccessfulCreate   daemonset/network-metrics-daemon          Created pod: network-metrics-daemon-h9rnm
      
      
      
      # sno-3 node
      sh-5.1# cat /etc/resolv.conf
      # Generated by NetworkManager
      search kni-qe-35.lab.eng.rdu2.redhat.com
      nameserver 2620:52:0:aa0::dead:beef  

            dosmith Douglas Smith
            mlammon@redhat.com Mike Lammon
            Weibin Liang Weibin Liang