  1. OpenShift Bugs
  2. OCPBUGS-50603

Validating admission policy blocks KAS bootstrap feature gate apply


    • Bug
    • Resolution: Unresolved
    • Normal
    • 4.18, 4.19.z, 4.20.0, 4.21
    • HyperShift
    • Quality / Stability / Reliability
    • False
    • Important
    • Rejected

      Description of problem:

      The 'config' ValidatingAdmissionPolicy installed by HyperShift blocks the Kubernetes API server (KAS) apply-bootstrap container from applying feature gate changes to the 'cluster' 'featuregate' resource. The HyperShift control plane operator nevertheless continues rolling out the control plane components, which then either run with an unchanged list of feature gates or crash because the feature gate version does not match the expected component version.

      Version-Release number of selected component (if applicable):

      4.18.0-rc.8    

      How reproducible:

      Always when changing feature gates

      Steps to Reproduce:

          1. Create a cluster that sets .spec.configuration.featureGate in the HostedCluster spec.
          2. Update .spec.configuration.featureGate in the cluster's HostedCluster spec.
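
The update in step 2 can be expressed as a JSON merge patch against the HostedCluster. The following is a minimal sketch only: the helper name `feature_gate_patch` is hypothetical, the `featureSet` value is illustrative, and the cluster name and namespace in the comment are placeholders that do not come from this report.

```python
import json


def feature_gate_patch(feature_set):
    """Build a merge patch that sets .spec.configuration.featureGate.

    Hypothetical helper for illustration; 'featureSet' is a field of the
    config.openshift.io/v1 FeatureGate spec.
    """
    return {"spec": {"configuration": {"featureGate": {"featureSet": feature_set}}}}


# Illustrative value only; use whatever feature gate change is being tested.
patch = feature_gate_patch("TechPreviewNoUpgrade")
print(json.dumps(patch))

# The printed JSON could then be applied with something like:
#   kubectl patch hostedcluster <name> -n <namespace> --type=merge -p '<json>'
# where <name> and <namespace> are placeholders.
```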
          

      Actual results:

      Validating admission policy blocks KAS apply-bootstrap container from applying feature gate changes.

      Expected results:

      Validating admission policy allows the KAS apply-bootstrap container to apply feature gate changes, without the 'config' ValidatingAdmissionPolicy having to be temporarily deleted.

      Additional info:

      ValidatingAdmissionPolicy causing the problem:
      
      apiVersion: admissionregistration.k8s.io/v1
      kind: ValidatingAdmissionPolicy
      metadata:
        creationTimestamp: "2025-02-09T18:58:30Z"
        generation: 1
        labels:
          hypershift.openshift.io/managed: "true"
        name: config
        resourceVersion: "1350"
        uid: 354ceb20-e118-4024-ab6a-22ab492689e5
      spec:
        failurePolicy: Fail
        matchConstraints:
          matchPolicy: Equivalent
          namespaceSelector: {}
          objectSelector: {}
          resourceRules:
          - apiGroups:
            - config.openshift.io
            apiVersions:
            - v1
            operations:
            - UPDATE
            - DELETE
            resources:
            - apiservers
            - authentications
            - featuregates
            - images
            - imagecontentpolicies
            - ingresses
            - proxies
            - schedulers
            - networks
            - oauths
            scope: '*'
        validations:
        - expression: request.userInfo.username in ['system:hosted-cluster-config'] || (has(object.spec)
            && has(oldObject.spec) && object.spec == oldObject.spec)
          message: This resource cannot be created, updated, or deleted. Please ask your
            administrator to modify the resource in the HostedCluster object.
          reason: Invalid
      status:
        observedGeneration: 1
        typeChecking: {}
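
To make the effect of the validation expression concrete, here is a rough Python emulation of its logic for UPDATE requests. It is a sketch under stated assumptions, not the API server's CEL evaluation: the function name `policy_allows` and the user `example:bootstrap-user` are hypothetical, and the identity actually used by the apply-bootstrap container is not stated in this report.

```python
# Usernames exempted by the policy's expression (taken from the policy above).
ALLOWED_USERS = ["system:hosted-cluster-config"]


def policy_allows(username, obj, old_obj):
    """Approximate the expression:
    request.userInfo.username in [...] ||
    (has(object.spec) && has(oldObject.spec) && object.spec == oldObject.spec)
    """
    if username in ALLOWED_USERS:
        return True
    return "spec" in obj and "spec" in old_obj and obj["spec"] == old_obj["spec"]


old = {"spec": {}, "status": {}}
spec_change = {"spec": {"featureSet": "TechPreviewNoUpgrade"}, "status": {}}
status_only = {"spec": {}, "status": {"featureGates": [{"version": "4.18.0-rc.8"}]}}

# "example:bootstrap-user" is a hypothetical identity for illustration.
print(policy_allows("example:bootstrap-user", spec_change, old))        # False
print(policy_allows("example:bootstrap-user", status_only, old))        # True
print(policy_allows("system:hosted-cluster-config", spec_change, old))  # True
```

Under this reading, any spec change to 'featuregates' made by a user other than 'system:hosted-cluster-config' is rejected, while updates that leave the spec untouched pass.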
      
      Example control plane problems:
      
      [prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$ kubectl get pods -n master-cukfgu110eiif3ngu98g | grep ago
      cluster-image-registry-operator-d78bbf946-vz6bw        1/2     CrashLoopBackOff   15 (82s ago)     106m
      cluster-network-operator-55bbd49b5d-pzg6b              3/3     Running            15 (5m42s ago)   106m
      cluster-node-tuning-operator-6df7785f4f-fr5l2          0/1     CrashLoopBackOff   21 (84s ago)     106m
      dns-operator-5cf4fc8595-6xxpp                          0/1     CrashLoopBackOff   21 (78s ago)     106m
      ingress-operator-5ddb8868cf-fkkwt                      1/2     CrashLoopBackOff   21 (115s ago)    106m
      [prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$ 
      
      Example control plane pod crash:
      
      [prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$ kubectl logs -n master-cukfgu110eiif3ngu98g cluster-image-registry-operator-d78bbf946-vz6bw  -p --tail=10
      Defaulted container "cluster-image-registry-operator" out of: cluster-image-registry-operator, client-token-minter
      E0210 14:10:26.537119       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:10:26.859693       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:10:27.502210       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:10:28.784156       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:10:31.346885       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:10:36.469479       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:10:46.712338       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:11:07.195708       1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
      E0210 14:11:26.215863       1 starter.go:91] timed out waiting for FeatureGate detection
      W0210 14:11:26.216539       1 builder.go:136] graceful termination failed, controllers failed with error: timed out waiting for FeatureGate detection
      [prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$ 
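
The "missing desired version" error above suggests the operator looks for its own version in the 'cluster' featuregate. A minimal sketch of such a check follows, assuming the `status.featureGates[].version` layout of the config.openshift.io/v1 FeatureGate resource. The function `has_desired_version` is hypothetical and is not the operator's actual code, and the "4.17.0" entry is made-up sample data.

```python
def has_desired_version(featuregate, desired_version):
    """Return True if the featuregate status lists the desired version.

    Assumes the status layout status.featureGates[].version.
    """
    details = featuregate.get("status", {}).get("featureGates", [])
    return any(entry.get("version") == desired_version for entry in details)


# Sample objects for illustration; "4.17.0" is invented sample data.
stale = {"status": {"featureGates": [{"version": "4.17.0"}]}}
current = {"status": {"featureGates": [{"version": "4.18.0-rc.8"}]}}

print(has_desired_version(stale, "4.18.0-rc.8"))    # False
print(has_desired_version(current, "4.18.0-rc.8"))  # True
```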
      

       

              Unassigned
              richardtheis Richard Theis
              XiuJuan Wang