OpenShift Bugs / OCPBUGS-17701

EUS upgrade from 4.12 -> 4.14 is not working


    • No
    • MCO Sprint 240
    • 1
    • Approved
    • False

      Since we are hitting this bug, other clusters may hit it too when upgrading between EUS releases.


      Description of problem:

      When attempting an EUS-to-EUS upgrade from 4.12.z to 4.14 (CI builds), I consistently see that after upgrading OCP to 4.14, the worker MachineConfigPool goes into a degraded state with the following message:
      {noformat}
      message: 'Node c01-dbn-412-tzm44-worker-0-7w6wg is reporting: "failed to run
              nmstatectl: fork/exec /run/machine-config-daemon-bin/nmstatectl: no such file
              or directory", Node c01-dbn-412-tzm44-worker-0-cmqsl is reporting: "failed
              to run nmstatectl: fork/exec /run/machine-config-daemon-bin/nmstatectl: no
              such file or directory", Node c01-dbn-412-tzm44-worker-0-qrp6v is reporting:
              "failed to run nmstatectl: fork/exec /run/machine-config-daemon-bin/nmstatectl:
              no such file or directory"'
      {noformat}
      The clusterversion then reports an error:
      {noformat}
      [cloud-user@ocp-psi-executor dbasunag]$ oc get clusterversion
      NAME      VERSION                         AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.13.0-0.ci-2023-08-14-110508   True        True          125m    Unable to apply 4.14.0-0.ci-2023-08-14-152624: wait has exceeded 40 minutes for these operators: machine-config
      [cloud-user@ocp-psi-executor dbasunag]$
      {noformat}
      This reproduces consistently on clusters with Kubernetes NMState (knmstate) installed.
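
      To confirm the symptom on a node, one option is to look for the binary at the path from the error message. The following is only a sketch: it prints the `oc debug` command for each node name passed as an argument and does not run anything. The helper name `check_cmd` is hypothetical, and the assumption that the path is visible under `/host` from a debug pod has not been verified on this cluster.

```shell
#!/bin/sh
# Path taken verbatim from the degraded-pool error message.
NMSTATECTL_PATH=/run/machine-config-daemon-bin/nmstatectl

# Hypothetical helper: print (do not execute) the oc debug command that
# would list the binary on one node.
check_cmd() {
  printf 'oc debug node/%s -- chroot /host ls -l %s\n' "$1" "$NMSTATECTL_PATH"
}

# Print one command per node name given on the command line.
for node in "$@"; do
  check_cmd "$node"
done
```

      Example: `sh check.sh c01-dbn-412-tzm44-worker-0-7w6wg` prints the command to run against that worker.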
      

      Version-Release number of selected component (if applicable):

      4.12.29 -> 4.13.0-0.ci-2023-08-14-110508 -> 4.14.0-0.ci-2023-08-14-152624
      

      How reproducible:

      100%
      

      Steps to Reproduce:

      1. Perform an EUS upgrade on a cluster with CNV, ODF, and knmstate installed.
      2. After pausing the worker MCP, upgrade OCP, ODF, CNV, and knmstate to 4.13. Everything worked fine at this stage.
      3. Upgrade OCP to 4.14. Once the master MCP was updated (all master nodes were updated), the worker MCP went into a degraded state and clusterversion eventually reported an error.
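
      The OCP part of these steps can be sketched roughly as below. This is not the exact sequence used in the reproduction: the helper names (`run`, `set_worker_pause`, `upgrade_to_image`, `eus_flow`) are hypothetical, the release pullspecs are placeholders to be supplied by the caller, and the `--allow-explicit-upgrade --force` flags are an assumption for unsigned CI builds. The script prints commands by default and only executes them when `RUN=1` is set. The ODF, CNV, and knmstate upgrades between the two hops are not covered.

```shell
#!/bin/sh
# Dry-run wrapper: print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Pause (true) or unpause (false) the worker MachineConfigPool.
set_worker_pause() {
  run oc patch mcp/worker --type merge --patch "{\"spec\":{\"paused\":$1}}"
}

# Start an upgrade to an explicit release image.
# The extra flags are an assumption for unsigned CI payloads.
upgrade_to_image() {
  run oc adm upgrade --to-image="$1" --allow-explicit-upgrade --force
}

# Sequence up to the point where the failure was observed:
# pause workers, hop to 4.13 ($1), then hop to 4.14 ($2).
eus_flow() {
  set_worker_pause true
  upgrade_to_image "$1"   # 4.13 hop; layered operators are upgraded after this
  upgrade_to_image "$2"   # 4.14 hop; worker pool is still paused here
}
```

      Calling `eus_flow <4.13-pullspec> <4.14-pullspec>` in dry-run mode prints the three commands in order, which makes it easy to review them before running anything against a cluster.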
      

      Actual results:

      [cloud-user@ocp-psi-executor dbasunag]$ oc get co
      NAME                                       VERSION                         AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication                             4.14.0-0.ci-2023-08-14-152624   True        False         False      9h      
      baremetal                                  4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      cloud-controller-manager                   4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      cloud-credential                           4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      cluster-autoscaler                         4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      config-operator                            4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      console                                    4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      control-plane-machine-set                  4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      csi-snapshot-controller                    4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      dns                                        4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      etcd                                       4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      image-registry                             4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      ingress                                    4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      insights                                   4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      kube-apiserver                             4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      kube-controller-manager                    4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      kube-scheduler                             4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      kube-storage-version-migrator              4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      machine-api                                4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      machine-approver                           4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      machine-config                             4.13.0-0.ci-2023-08-14-110508   True        True          True       2d23h   Unable to apply 4.14.0-0.ci-2023-08-14-152624: error during syncRequiredMachineConfigPools: [context deadline exceeded, failed to update clusteroperator: [client rate limiter Wait returned an error: context deadline exceeded, error MachineConfigPool worker is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)]]
      marketplace                                4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      monitoring                                 4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      network                                    4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      node-tuning                                4.14.0-0.ci-2023-08-14-152624   True        False         False      95m     
      openshift-apiserver                        4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      openshift-controller-manager               4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      openshift-samples                          4.14.0-0.ci-2023-08-14-152624   True        False         False      98m     
      operator-lifecycle-manager                 4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      operator-lifecycle-manager-catalog         4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      operator-lifecycle-manager-packageserver   4.14.0-0.ci-2023-08-14-152624   True        False         False      2d22h   
      service-ca                                 4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      storage                                    4.14.0-0.ci-2023-08-14-152624   True        False         False      2d23h   
      [cloud-user@ocp-psi-executor dbasunag]$ 
      [cloud-user@ocp-psi-executor dbasunag]$ oc get mcp
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master   rendered-master-693b054330417fe5e098b58716603fc8   True      False      False      3              3                   3                     0                      2d23h
      worker   rendered-worker-b2f5a9084e9919b4c1c491658c73bce5   False     False      True       3              0                   0                     3                      2d23h
      [cloud-user@ocp-psi-executor dbasunag]$
      [cloud-user@ocp-psi-executor dbasunag]$ oc get node
      NAME                               STATUS   ROLES                  AGE     VERSION
      c01-dbn-412-tzm44-master-0         Ready    control-plane,master   2d23h   v1.27.4+deb2c60
      c01-dbn-412-tzm44-master-1         Ready    control-plane,master   2d23h   v1.27.4+deb2c60
      c01-dbn-412-tzm44-master-2         Ready    control-plane,master   2d23h   v1.27.4+deb2c60
      c01-dbn-412-tzm44-worker-0-7w6wg   Ready    worker                 2d22h   v1.25.11+1485cc9
      c01-dbn-412-tzm44-worker-0-cmqsl   Ready    worker                 2d22h   v1.25.11+1485cc9
      c01-dbn-412-tzm44-worker-0-qrp6v   Ready    worker                 2d22h   v1.25.11+1485cc9
      [cloud-user@ocp-psi-executor dbasunag]$ 
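
      For scripted checks, the degraded pool can be pulled out of the `oc get mcp` table shown above with a small filter. This is a minimal sketch, assuming the default tabular output with a `DEGRADED` header column; the function name `degraded_pools` is hypothetical.

```shell
#!/bin/sh
# Read `oc get mcp` tabular output on stdin and print the name of every
# pool whose DEGRADED column is "True". The column is located by its
# header name rather than by a fixed position.
degraded_pools() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "DEGRADED") col = i; next }
    col && $col == "True" { print $1 }
  '
}
```

      Usage: `oc get mcp | degraded_pools`. Against the output in this report it would print `worker`.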
      
      

      Expected results:

      The EUS upgrade should complete without errors.
      

      Additional info:

      Must-gather can be found here: https://drive.google.com/drive/folders/1SCZoYpGiRpOteTM-sTLmbfgr3hqsICVO?usp=drive_link
      

            rhn-engineering-skumari Sinny Kumari
            rhn-support-dbasunag Debarati Basu-Nag
            Rio Liu Rio Liu