OpenShift Bugs / OCPBUGS-33480

OpenShift cluster feature gate contains only the latest version, which impacts cluster operator availability

    • Bug
    • Resolution: Unresolved
    • Normal
    • None
    • 4.14, 4.14.0, 4.15, 4.15.0, 4.16, 4.16.0
    • HyperShift
    • Moderate
    • No
    • False

      Description of problem:

      The OpenShift cluster network operator may crash during a cluster version update if its pod restarts while the cluster feature gate status lists only the new version.

      Version-Release number of selected component (if applicable):

      OpenShift version 4.14 and later    

      How reproducible:

      The issue is not easy to reproduce during a normal, successful version update.

      Steps to Reproduce:

      1. Start an OpenShift patch version update.
      2. Wait until the first Kubernetes API server pod has started on the new version. From this point, the cluster feature gate status contains only the new version.
      3. Restart the OpenShift network operator pod. It crashes with an error similar to the following:
      
      E0429 19:36:24.460120       1 simple_featuregate_reader.go:290] cluster failed with : unable to determine features: missing desired version "4.14.16" in featuregates.config.openshift.io/cluster
      W0429 19:36:43.501889       1 builder.go:109] graceful termination failed, controllers failed with error: failed to add controllers to manager: timed out waiting for FeatureGate detection
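
The mismatch behind the first log line can be illustrated with a minimal Python sketch. This is not the operator's actual code. It assumes the FeatureGate resource lists one entry per version under status.featureGates[].version; the target version "4.14.17" and the helper names are hypothetical, and only "4.14.16" comes from the log above.

```python
import json

# Hypothetical FeatureGate status in the middle of an update: only the new
# version has an entry. The field layout (status.featureGates[].version) is an
# assumption, and "4.14.17" is an illustrative target version.
FEATURE_GATE_JSON = """
{
  "status": {
    "featureGates": [
      {"version": "4.14.17", "enabled": [], "disabled": []}
    ]
  }
}
"""


def listed_versions(feature_gate):
    """Return the versions that have an entry in the feature gate status."""
    entries = feature_gate.get("status", {}).get("featureGates", [])
    return [entry["version"] for entry in entries if "version" in entry]


def has_version(feature_gate, desired):
    """Report whether the desired version has an entry in the status."""
    return desired in listed_versions(feature_gate)


feature_gate = json.loads(FEATURE_GATE_JSON)

# A restarted operator that is still on the old version looks up its own version.
print(has_version(feature_gate, "4.14.16"))
# An operator already on the new version finds its entry.
print(has_version(feature_gate, "4.14.17"))
```

In this sketch the lookup for "4.14.16" fails because the status no longer carries an entry for the old version, which matches the "missing desired version" message in the log.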

      Actual results:

      OpenShift cluster network operator pod may crash if there's a feature gate version mismatch.

      Expected results:

      OpenShift cluster network operator pod won't crash.

      Additional info:

      See https://ibm-argonauts.slack.com/archives/C01C8502FMM/p1714551198353339 for more information and initial discussion.

              cewong@redhat.com Cesar Wong
              richardtheis Richard Theis