OpenShift Bugs / OCPBUGS-35795

updating 4.15.0 to latest nightly 4.15.18 fails


    • Bug
    • Resolution: Obsolete
    • Major
    • None
    • 4.15.z
    • oc / update
    • None
    • No
    • False

      Description of problem:

      Updating the hub cluster from 4.15.0 to the latest nightly (4.15.18) failed.

      Version-Release number of selected component (if applicable):

      [kni@ocp-edge64 ~]$ oc version
      Client Version: 4.15.0
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: 4.15.18
      Kubernetes Version: v1.28.10+a2c84a5

      How reproducible:

      TBD

      Steps to Reproduce:

      1. Deploy a hub cluster with 3 masters and, on it, a hosted cluster with 6 worker nodes, using OCP 4.15.0 and ACM 2.9. I used these jobs for the deployment
      (because of some environment noise, it ran across a few jobs):
      a. https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/CI/job/job-runner/3059/
      b. https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/CI/job/job-runner/3064/
      c. https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/CI/job/job-runner/3067/
      
      2. Run the update test for OCP 4.15.0 -> 4.15.18; it failed on timeout:
      https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/Private_Folders/job/gamado/job/WIP-ocp-edge-auto-tests/55/testReport/deployment.upgrade/test_upgrade/test_hub_cluster_upgrade_quay_io_openshift_release_dev_ocp_release_4_15_18_x86_64_/
      
      
      
      Actual results:
      
      3 cluster operators are unhealthy: machine-config is degraded, while network and openshift-controller-manager are stuck progressing.
      
      [kni@ocp-edge64 ~]$ oc adm upgrade
      Failing=True:
        Reason: ClusterOperatorDegraded
        Message: Cluster operator machine-config is degraded
      Error while reconciling 4.15.18: the cluster operator machine-config is degraded
      Upgradeable=False
        Reason: MultipleReasons
        Message: Cluster should not be upgraded between minor versions for multiple reasons: AdminAckRequired,PoolUpdating
        * Kubernetes 1.29 and therefore OpenShift 4.16 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7031404 for details and instructions.
        * Cluster operator machine-config should not be upgraded between minor versions: One or more machine config pools are updating, please see `oc get mcp` for further details
      Upstream is unset, so the cluster will use an appropriate default.
      Channel: stable-4.14
      warning: Cannot display available updates:
        Reason: VersionNotFound
        Message: Unable to retrieve available updates: currently reconciling cluster version 4.15.18 not found in the "stable-4.14" channel
      [kni@ocp-edge64 ~]$ 
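
The VersionNotFound warning above comes from the channel (stable-4.14) not containing the version being reconciled (4.15.18). This is separate from the machine-config degradation and is not claimed to be the cause of the failed update. A minimal sketch for spotting such a mismatch follows; `channel_matches_version` is a hypothetical helper name, and it assumes the channel has the form `<name>-<major>.<minor>` and the version the form `<major>.<minor>.<patch>`.

```shell
# Hypothetical helper (not part of oc): succeeds when the channel's
# major.minor matches the cluster version's major.minor.
channel_matches_version() {
  channel="$1"                    # e.g. stable-4.14
  version="$2"                    # e.g. 4.15.18
  channel_minor="${channel##*-}"  # drop everything up to the last "-"
  version_minor="${version%.*}"   # drop the trailing ".<patch>"
  [ "$channel_minor" = "$version_minor" ]
}

# Possible use against a live cluster (needs a logged-in oc session):
#   channel="$(oc get clusterversion version -o jsonpath='{.spec.channel}')"
#   version="$(oc get clusterversion version -o jsonpath='{.status.desired.version}')"
#   channel_matches_version "$channel" "$version" || echo "channel/version mismatch"
```

With the values reported here (`stable-4.14` and `4.15.18`) the helper returns a non-zero status.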
      
      
      [kni@ocp-edge64 ~]$ oc get co -A
      NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication                             4.15.18   True        False         False      9h      
      baremetal                                  4.15.18   True        False         False      28h     
      cloud-controller-manager                   4.15.18   True        False         False      28h     
      cloud-credential                           4.15.18   True        False         False      28h     
      cluster-autoscaler                         4.15.18   True        False         False      28h     
      config-operator                            4.15.18   True        False         False      28h     
      console                                    4.15.18   True        False         False      21h     
      control-plane-machine-set                  4.15.18   True        False         False      28h     
      csi-snapshot-controller                    4.15.18   True        False         False      28h     
      dns                                        4.15.18   True        False         False      28h     
      etcd                                       4.15.18   True        False         False      28h     
      image-registry                             4.15.18   True        False         False      21h     
      ingress                                    4.15.18   True        False         False      28h     
      insights                                   4.15.18   True        False         False      28h     
      kube-apiserver                             4.15.18   True        False         False      28h     
      kube-controller-manager                    4.15.18   True        False         False      28h     
      kube-scheduler                             4.15.18   True        False         False      28h     
      kube-storage-version-migrator              4.15.18   True        False         False      21h     
      machine-api                                4.15.18   True        False         False      28h     
      machine-approver                           4.15.18   True        False         False      28h     
      machine-config                             4.15.18   True        False         True       28h     Failed to resync 4.15.18 because: error during syncRequiredMachineConfigPools: [context deadline exceeded, error required MachineConfigPool master is not ready, retrying. Status: (total: 3, ready 0, updated: 3, unavailable: 3, degraded: 0)]
      marketplace                                4.15.18   True        False         False      28h     
      monitoring                                 4.15.18   True        False         False      28h     
      network                                    4.15.18   True        True          False      28h     DaemonSet "/openshift-multus/network-metrics-daemon" is not available (awaiting 1 nodes)...
      node-tuning                                4.15.18   True        False         False      22h     
      openshift-apiserver                        4.15.18   True        False         False      5h12m   
      openshift-controller-manager               4.15.18   True        True          False      28h     Progressing: deployment/controller-manager: updated replicas is 1, desired replicas is 3...
      openshift-samples                          4.15.18   True        False         False      22h     
      operator-lifecycle-manager                 4.15.18   True        False         False      28h     
      operator-lifecycle-manager-catalog         4.15.18   True        False         False      28h     
      operator-lifecycle-manager-packageserver   4.15.18   True        False         False      28h     
      service-ca                                 4.15.18   True        False         False      28h     
      storage                                    4.15.18   True        False         False      28h     
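
To pick the unhealthy operators out of a long `oc get co` table like the one above, a small filter can help. The sketch below is only an illustration: `unhealthy_operators` is a hypothetical helper name, and it assumes the default column order shown above (AVAILABLE, PROGRESSING, DEGRADED in columns 3 to 5).

```shell
# Hypothetical helper: reads `oc get co` table output on stdin and prints
# the name of every operator that is not Available=True, Progressing=False,
# Degraded=False. The header row (line 1) is skipped.
unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $4 != "False" || $5 != "False") { print $1 }'
}

# Possible use against a live cluster (needs a logged-in oc session):
#   oc get co | unhealthy_operators
```

Applied to the table above, this filter lists machine-config, network, and openshift-controller-manager.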
      
      
      [kni@ocp-edge64 ~]$ oc get mcp
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master   rendered-master-cc202996373d1de3b82d77374849f82f   False     True       False      3              0                   3                     0                      28h
      worker   rendered-worker-480b6e1544d2b557270da9caca3dfebd   True      False      False      0              0                   0                     0                      28h
      [kni@ocp-edge64 ~]$ 
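
The master pool above reports 3 machines updated but 0 ready, which matches the machine-config operator message. A minimal sketch for flagging pools in that state follows; `unready_pools` is a hypothetical helper name, and it assumes the default `oc get mcp` column order shown above (MACHINECOUNT in column 6, READYMACHINECOUNT in column 7).

```shell
# Hypothetical helper: reads `oc get mcp` table output on stdin and prints
# "<pool> <ready>/<total>" for every pool with fewer ready machines than
# total machines. The header row (line 1) is skipped.
unready_pools() {
  awk 'NR > 1 && $7 + 0 < $6 + 0 { printf "%s %d/%d\n", $1, $7, $6 }'
}

# Possible use against a live cluster (needs a logged-in oc session):
#   oc get mcp | unready_pools
```

For the output above, only the master pool would be reported, as `master 0/3`; the worker pool has no machines and is skipped.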
      
      [kni@ocp-edge64 ~]$ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.15.18   True        False         21h     Error while reconciling 4.15.18: the cluster operator machine-config is degraded
      
       

      Expected results:

      The update to 4.15.18 completes successfully.

      Additional info:

       

       

              pratikam Pratik Mahajan
              rhn-support-gamado Gal Amado
              Yang Yang Yang Yang