OpenShift Bugs / OCPBUGS-31007

Failed to update from OCP 4.14.16 to OCP 4.15.2

      Description of problem:

      Using the fast-4.14 channel, the update process from ocp-4.14.16 to ocp-4.15.2 fails although it is supported.

      Version-Release number of selected component (if applicable):

      [kni@ocp-edge77 ~]$ oc version
      Client Version: 4.14.14
      Kustomize Version: v5.0.1
      Server Version: 4.14.16
      Kubernetes Version: v1.28.6+6216ea1
      

      How reproducible:

      TBD - I tried only once  

      Steps to Reproduce:

          1. Deploy an ocp-4.14.14 hub cluster. I used this job for that: https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/CI/job/job-runner/2440/
      
          2. Update to ocp-4.14.16 (I don't remember whether I used stable-4.14 or fast-4.14 for that, but it completed successfully after about 1.5 hours).
          
          3. Using the fast-4.15 channel, update to ocp-4.15.2 by running:
      oc adm upgrade --to=4.15.2
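      The report does not say how the channel was switched before step 3. A minimal sketch of steps 2 and 3, assuming the channel is changed with `oc adm upgrade channel`; the helper name `upgrade_to` and the `OC` override are hypothetical and only for illustration:

```shell
# Hypothetical helper, not part of the original report.
# OC can be overridden (for example OC=echo) to print the commands instead of running them.
OC="${OC:-oc}"

upgrade_to() {
  channel="$1"   # update channel, e.g. fast-4.15
  version="$2"   # target release, e.g. 4.15.2
  # Select the channel that offers the target version;
  # stop here if the channel change fails.
  "$OC" adm upgrade channel "$channel" || return 1
  # Request the update to the target version.
  "$OC" adm upgrade --to="$version"
}
```

      For step 3 this would be called as `upgrade_to fast-4.15 4.15.2`.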

      Actual results:

          The upgrade failed:
      1. Output of oc adm upgrade:
      (.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ oc adm upgrade
      Failing=True:

        Reason: ClusterOperatorDegraded
        Message: Cluster operator etcd is degraded

      info: An upgrade is in progress. Unable to apply 4.15.2: wait has exceeded 40 minutes for these operators: etcd

      Upgradeable=False

        Reason: PoolUpdating
        Message: Cluster operator machine-config should not be upgraded between minor versions: One or more machine config pools are updating, please see `oc get mcp` for further details

      Upstream is unset, so the cluster will use an appropriate default.
      Channel: fast-4.15 (available channels: candidate-4.15, candidate-4.16, fast-4.15)
      No updates available. You may still upgrade to a specific release image with --to-image or wait for new updates to be available.
      (.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ 
      
      
      
      
      2. Cluster operators that are not updated, and cluster operators that are still progressing:
      (.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ oc get co 
      NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication                             4.15.2    True        False         False      15h     
      baremetal                                  4.15.2    True        False         False      45h     
      cloud-controller-manager                   4.15.2    True        False         False      45h     
      cloud-credential                           4.15.2    True        False         False      46h     
      cluster-autoscaler                         4.15.2    True        False         False      45h     
      config-operator                            4.15.2    True        False         False      45h     
      console                                    4.15.2    True        False         False      24h     
      control-plane-machine-set                  4.15.2    True        False         False      45h     
      csi-snapshot-controller                    4.15.2    True        False         False      45h     
      dns                                        4.15.2    True        False         False      45h     
      etcd                                       4.15.2    True        False         True       15h     EtcdEndpointsDegraded: EtcdEndpointsController can't evaluate whether quorum is safe: etcd cluster has quorum of 2 and 2 healthy members which is not fault tolerant: [{Member:ID:5282880766147895622 name:"master-0-2" peerURLs:"https://192.168.123.129:2380" clientURLs:"https://192.168.123.129:2379"  Healthy:false Took: Error:create client failure: failed to make etcd client for endpoints [https://192.168.123.129:2379]: context deadline exceeded} {Member:ID:9307753032987554926 name:"master-0-0" peerURLs:"https://192.168.123.67:2380" clientURLs:"https://192.168.123.67:2379"  Healthy:true Took:8.703553ms Error:<nil>} {Member:ID:12721024147063591356 name:"master-0-1" peerURLs:"https://192.168.123.133:2380" clientURLs:"https://192.168.123.133:2379"  Healthy:true Took:1.056064ms Error:<nil>}]...
      image-registry                             4.15.2    True        False         False      23h     
      ingress                                    4.15.2    True        False         False      45h     
      insights                                   4.15.2    True        False         False      45h     
      kube-apiserver                             4.15.2    True        False         False      45h     
      kube-controller-manager                    4.15.2    True        False         False      45h     
      kube-scheduler                             4.15.2    True        False         False      45h     
      kube-storage-version-migrator              4.15.2    True        False         False      27h     
      machine-api                                4.15.2    True        False         False      45h     
      machine-approver                           4.15.2    True        False         False      45h     
      machine-config                             4.14.16   True        True          True       45h     Unable to apply 4.15.2: error during syncRequiredMachineConfigPools: [context deadline exceeded, failed to update clusteroperator: [client rate limiter Wait returned an error: context deadline exceeded, MachineConfigPool master has not progressed to latest configuration: controller version mismatch for rendered-master-2f85e15567c3218093da6886ffb92d72 expected 6eb0e07f062c0c06965da711454d1eaa12934f78 has c15ec9ce2646231f4c227584d29c4487371febf9: 1 (ready 0) out of 3 nodes are updating to latest configuration rendered-master-2bc0ebd35ed0b7bd6220143c73032eb0, retrying]]
      marketplace                                4.15.2    True        False         False      45h     
      monitoring                                 4.15.2    True        False         False      6h58m   
      network                                    4.15.2    True        True          False      45h     DaemonSet "/openshift-multus/network-metrics-daemon" is not available (awaiting 1 nodes)...
      node-tuning                                4.15.2    True        False         False      21h     
      openshift-apiserver                        4.15.2    True        False         True       45h     APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver (container is waiting in apiserver-585cb497f6-xc5zx pod)
      openshift-controller-manager               4.15.2    True        False         False      45h     
      openshift-samples                          4.15.2    True        False         False      21h     
      operator-lifecycle-manager                 4.15.2    True        False         False      45h     
      operator-lifecycle-manager-catalog         4.15.2    True        False         False      45h     
      operator-lifecycle-manager-packageserver   4.15.2    True        False         False      7h3m    
      service-ca                                 4.15.2    True        False         False      45h     
      storage                                    4.15.2    True        False         False      45h     
      (.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ 
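      With this many rows, the operators that need attention are easy to miss. A small sketch that filters the listing, assuming it is fed `oc get co --no-headers` on stdin and that every row has a VERSION value; the function name `lagging_operators` is hypothetical:

```shell
# Hypothetical helper, not part of the original report.
# Reads `oc get co --no-headers` output on stdin and prints every operator that
# is not at the target version, is not available, is progressing, or is degraded.
# Columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
lagging_operators() {
  awk -v target="$1" '
    $2 != target || $3 != "True" || $4 == "True" || $5 == "True" {
      printf "%s version=%s available=%s progressing=%s degraded=%s\n", $1, $2, $3, $4, $5
    }'
}
```

      It would be used as `oc get co --no-headers | lagging_operators 4.15.2`. On the listing above it should report etcd, machine-config, network, and openshift-apiserver.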
      
      
      3. Machine config pool status:
      (.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ oc get mcp
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master   rendered-master-2f85e15567c3218093da6886ffb92d72   False     True       False      3              0                   1                     0                      45h
      worker   rendered-worker-c469c622a13c555dabb8115bebe4aa4a   True      False      False      0              0                   0                     0                      45h
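      The same kind of filter can summarize which pools have not finished updating. A sketch assuming `oc get mcp --no-headers` on stdin with the column order shown above; the function name `pending_pools` is hypothetical:

```shell
# Hypothetical helper, not part of the original report.
# Reads `oc get mcp --no-headers` output on stdin and prints each pool whose
# UPDATED column is not True, with its ready and updated machine counts.
# Columns: NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT
#          READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
pending_pools() {
  awk '$3 != "True" {
    printf "%s ready=%s/%s updated=%s/%s\n", $1, $7, $6, $8, $6
  }'
}
```

      It would be used as `oc get mcp --no-headers | pending_pools`. For the output above, only the master pool is listed, with 0 of 3 machines ready and 1 of 3 updated.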
      
      
      

      Expected results:

         All cluster operators are updated to ocp-4.15.2

      Additional info:

          

              Unassigned Unassigned
              rhn-support-gamado Gal Amado
              Jia Liu Jia Liu