OpenShift Bugs / OCPBUGS-34492

[Upgrade] kube-apiserver stuck updating versions when upgrading from old releases


      For clusters created before OpenShift 4.8, the stored versions of the resources flowschemas.flowcontrol.apiserver.k8s.io and prioritylevelconfigurations.flowcontrol.apiserver.k8s.io will be automatically migrated to a newer version in preparation for the removal of the v1alpha1 version in OpenShift 4.16. No administrator action is required.
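
      Although no administrator action is required, one way to confirm the migration on a live cluster (a hedged sketch; the exact StorageVersionMigration object names vary and are created by the kube-storage-version-migrator) is:

      $ oc get storageversionmigrations.migration.k8s.io | grep flowcontrol    # migration objects, if any, covering flowschemas/prioritylevelconfigurations
      $ oc get --raw /apis/flowcontrol.apiserver.k8s.io                        # versions the API group serves; per the note above, v1alpha1 should be gone in 4.16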

      This is a clone of issue OCPBUGS-34408. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-33963. The following is the description of the original issue:

      Description of problem:

      kube-apiserver was stuck updating versions when upgrading from 4.1 to 4.16 with an AWS IPI installation.
          

      Version-Release number of selected component (if applicable):

      4.16.0-0.nightly-2024-05-01-111315
          

      How reproducible:

          always
          

      Steps to Reproduce:

          1. IPI install an AWS 4.1 cluster and upgrade it step by step to 4.16.
          2. Observe that the upgrade gets stuck on the 4.15 -> 4.16 hop, waiting on etcd and kube-apiserver to update.
          
          

      Actual results:

         1. The upgrade was stuck on the 4.15 -> 4.16 hop, waiting on etcd and kube-apiserver to update:
         $ oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.15.0-0.nightly-2024-05-16-091947   True        True          39m     Working towards 4.16.0-0.nightly-2024-05-16-092402: 111 of 894 done (12% complete)
      
          

      Expected results:

      Upgrade should be successful.
          

      Additional info:

      Must-gather: https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.16-amd64-nightly-4.16-upgrade-from-stable-4.1-aws-ipi-f30/1791391925467615232/artifacts/aws-ipi-f30/gather-must-gather/artifacts/must-gather.tar
      
      Checked the must-gather logs:
      $ omg get clusterversion -oyaml
      ...
      conditions:
        - lastTransitionTime: '2024-05-17T09:35:29Z'
          message: Done applying 4.15.0-0.nightly-2024-05-16-091947
          status: 'True'
          type: Available
        - lastTransitionTime: '2024-05-18T06:31:41Z'
          message: 'Multiple errors are preventing progress:
      
            * Cluster operator kube-apiserver is updating versions
      
            * Could not update flowschema "openshift-etcd-operator" (82 of 894): the server
            does not recognize this resource, check extension API servers'
          reason: MultipleErrors
          status: 'True'
          type: Failing
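      
            The "does not recognize this resource" error points at API discovery for the flowcontrol group. On a live cluster this could be cross-checked with standard client commands (a hedged sketch, not output captured in this must-gather):
      
            $ oc api-resources --api-group=flowcontrol.apiserver.k8s.io    # should list flowschemas and prioritylevelconfigurations
            $ oc get flowschema openshift-etcd-operator -o yaml            # the object the CVO could not update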
      
      $ omg get co | grep -v '.*True.*False.*False'
      NAME                                      VERSION                             AVAILABLE  PROGRESSING  DEGRADED  SINCE
      kube-apiserver                            4.15.0-0.nightly-2024-05-16-091947  True       True         False     10m
      
      $ omg get pod -n openshift-kube-apiserver
      NAME                                               READY  STATUS     RESTARTS  AGE
      installer-40-ip-10-0-136-146.ec2.internal          0/1    Succeeded  0         2h29m
      installer-41-ip-10-0-143-206.ec2.internal          0/1    Succeeded  0         2h25m
      installer-43-ip-10-0-154-116.ec2.internal          0/1    Succeeded  0         2h22m
      installer-44-ip-10-0-154-116.ec2.internal          0/1    Succeeded  0         1h35m
      kube-apiserver-guard-ip-10-0-136-146.ec2.internal  1/1    Running    0         2h24m
      kube-apiserver-guard-ip-10-0-143-206.ec2.internal  1/1    Running    0         2h24m
      kube-apiserver-guard-ip-10-0-154-116.ec2.internal  0/1    Running    0         2h24m
      kube-apiserver-ip-10-0-136-146.ec2.internal        5/5    Running    0         2h27m
      kube-apiserver-ip-10-0-143-206.ec2.internal        5/5    Running    0         2h24m
      kube-apiserver-ip-10-0-154-116.ec2.internal        4/5    Running    17        1h34m
      revision-pruner-39-ip-10-0-136-146.ec2.internal    0/1    Succeeded  0         2h44m
      revision-pruner-39-ip-10-0-143-206.ec2.internal    0/1    Succeeded  0         2h50m
      revision-pruner-39-ip-10-0-154-116.ec2.internal    0/1    Succeeded  0         2h52m
      revision-pruner-40-ip-10-0-136-146.ec2.internal    0/1    Succeeded  0         2h29m
      revision-pruner-40-ip-10-0-143-206.ec2.internal    0/1    Succeeded  0         2h29m
      revision-pruner-40-ip-10-0-154-116.ec2.internal    0/1    Succeeded  0         2h29m
      revision-pruner-41-ip-10-0-136-146.ec2.internal    0/1    Succeeded  0         2h26m
      revision-pruner-41-ip-10-0-143-206.ec2.internal    0/1    Succeeded  0         2h26m
      revision-pruner-41-ip-10-0-154-116.ec2.internal    0/1    Succeeded  0         2h26m
      revision-pruner-42-ip-10-0-136-146.ec2.internal    0/1    Succeeded  0         2h24m
      revision-pruner-42-ip-10-0-143-206.ec2.internal    0/1    Succeeded  0         2h23m
      revision-pruner-42-ip-10-0-154-116.ec2.internal    0/1    Succeeded  0         2h23m
      revision-pruner-43-ip-10-0-136-146.ec2.internal    0/1    Succeeded  0         2h23m
      revision-pruner-43-ip-10-0-143-206.ec2.internal    0/1    Succeeded  0         2h23m
      revision-pruner-43-ip-10-0-154-116.ec2.internal    0/1    Succeeded  0         2h23m
      revision-pruner-44-ip-10-0-136-146.ec2.internal    0/1    Succeeded  0         1h35m
      revision-pruner-44-ip-10-0-143-206.ec2.internal    0/1    Succeeded  0         1h35m
      revision-pruner-44-ip-10-0-154-116.ec2.internal    0/1    Succeeded  0         1h35m
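      
      The kube-apiserver-ip-10-0-154-116.ec2.internal pod (4/5 ready, 17 restarts) is the interesting one. On a live cluster the failing container could be inspected directly (a hedged sketch; here the equivalent data comes from the must-gather below):
      
      $ oc describe pod -n openshift-kube-apiserver kube-apiserver-ip-10-0-154-116.ec2.internal    # shows which container is failing its readiness probe
      $ oc logs -n openshift-kube-apiserver kube-apiserver-ip-10-0-154-116.ec2.internal -c kube-apiserver --previous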
      
      Checked the kube-apiserver-ip-10-0-154-116.ec2.internal logs; something seems wrong with the informers:
      $ grep 'informers not started yet' current.log  | wc -l
      360
      
      $ grep 'informers not started yet' current.log 
      2024-05-18T06:34:51.888804183Z [-]informer-sync failed: 4 informers not started yet: [*v1.PriorityLevelConfiguration *v1.Secret *v1.FlowSchema *v1.ConfigMap]
      2024-05-18T06:34:51.889350484Z [-]informer-sync failed: 4 informers not started yet: [*v1.PriorityLevelConfiguration *v1.FlowSchema *v1.Secret *v1.ConfigMap]
      2024-05-18T06:34:52.004808401Z [-]informer-sync failed: 2 informers not started yet: [*v1.FlowSchema *v1.PriorityLevelConfiguration]
      2024-05-18T06:34:52.095516498Z [-]informer-sync failed: 2 informers not started yet: [*v1.PriorityLevelConfiguration *v1.FlowSchema]
      ...
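      
      The informer-sync failures come from the kube-apiserver readiness checks. On a live cluster the same signal is visible through the readyz endpoint (a hedged sketch, not output from this must-gather):
      
      $ oc get --raw '/readyz?verbose' | grep informer-sync    # reports [+]informer-sync ok once the FlowSchema/PriorityLevelConfiguration informers have synced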
      
      
          
