Description of problem:
The HyperShift-installed 'config' ValidatingAdmissionPolicy blocks the Kubernetes API server (KAS) apply-bootstrap container from applying feature gate changes to the 'cluster' 'featuregate' resource. The HyperShift control plane operator nevertheless continues the rollout of the control plane components, which then either start running with an unchanged list of feature gates or crash because the feature gate version does not match the expected component version.
Version-Release number of selected component (if applicable):
4.18.0-rc.8
How reproducible:
Always when changing feature gates
Steps to Reproduce:
1. Create a cluster that sets .spec.configuration.featureGate in the HostedCluster spec.
2. Update .spec.configuration.featureGate in the cluster's HostedCluster spec.
Actual results:
The 'config' validating admission policy blocks the KAS apply-bootstrap container from applying feature gate changes.
Expected results:
The 'config' validating admission policy allows the KAS apply-bootstrap container to apply feature gate changes, without the policy having to be temporarily deleted.
Additional info:
ValidatingAdmissionPolicy causing the problem:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  creationTimestamp: "2025-02-09T18:58:30Z"
  generation: 1
  labels:
    hypershift.openshift.io/managed: "true"
  name: config
  resourceVersion: "1350"
  uid: 354ceb20-e118-4024-ab6a-22ab492689e5
spec:
  failurePolicy: Fail
  matchConstraints:
    matchPolicy: Equivalent
    namespaceSelector: {}
    objectSelector: {}
    resourceRules:
    - apiGroups:
      - config.openshift.io
      apiVersions:
      - v1
      operations:
      - UPDATE
      - DELETE
      resources:
      - apiservers
      - authentications
      - featuregates
      - images
      - imagecontentpolicies
      - ingresses
      - proxies
      - schedulers
      - networks
      - oauths
      scope: '*'
  validations:
  - expression: request.userInfo.username in ['system:hosted-cluster-config'] || (has(object.spec)
      && has(oldObject.spec) && object.spec == oldObject.spec)
    message: This resource cannot be created, updated, or deleted. Please ask your
      administrator to modify the resource in the HostedCluster object.
    reason: Invalid
status:
  observedGeneration: 1
  typeChecking: {}
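To make the failure mode easier to reason about, here is a minimal Python sketch that mirrors the policy's validation expression for an UPDATE request. It is only a model of the CEL logic, not HyperShift code. The function name `policy_allows`, the username `example:kas-bootstrap-user`, and the sample spec contents are hypothetical; this report does not state which identity the apply-bootstrap container actually uses.

```python
# Model of the 'config' policy's CEL expression (UPDATE requests only).
# Assumption: objects are plain dicts; has(x.spec) is modeled as "spec" in x.
ALLOWED_USERNAMES = ["system:hosted-cluster-config"]


def policy_allows(username, obj, old_obj):
    """Return True if the modeled validation expression admits the request."""
    if username in ALLOWED_USERNAMES:
        return True
    # Any other user is admitted only when the spec is unchanged.
    return "spec" in obj and "spec" in old_obj and obj["spec"] == old_obj["spec"]


old = {"spec": {}}
new = {"spec": {"featureSet": "TechPreviewNoUpgrade"}}

# Hypothetical identity standing in for the apply-bootstrap container.
print(policy_allows("example:kas-bootstrap-user", new, old))    # False: spec changed
print(policy_allows("system:hosted-cluster-config", new, old))  # True: allowed user
print(policy_allows("example:kas-bootstrap-user", old, old))    # True: spec unchanged
```

Under this model, any writer other than 'system:hosted-cluster-config' that changes the spec of a matched resource is rejected, which is consistent with the apply-bootstrap behavior described in this report.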
Example control plane problems:
[prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$ kubectl get pods -n master-cukfgu110eiif3ngu98g | grep ago
cluster-image-registry-operator-d78bbf946-vz6bw   1/2   CrashLoopBackOff   15 (82s ago)     106m
cluster-network-operator-55bbd49b5d-pzg6b         3/3   Running            15 (5m42s ago)   106m
cluster-node-tuning-operator-6df7785f4f-fr5l2     0/1   CrashLoopBackOff   21 (84s ago)     106m
dns-operator-5cf4fc8595-6xxpp                     0/1   CrashLoopBackOff   21 (78s ago)     106m
ingress-operator-5ddb8868cf-fkkwt                 1/2   CrashLoopBackOff   21 (115s ago)    106m
[prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$
Example control plane pod crash:
[prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$ kubectl logs -n master-cukfgu110eiif3ngu98g cluster-image-registry-operator-d78bbf946-vz6bw -p --tail=10
Defaulted container "cluster-image-registry-operator" out of: cluster-image-registry-operator, client-token-minter
E0210 14:10:26.537119 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:10:26.859693 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:10:27.502210 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:10:28.784156 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:10:31.346885 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:10:36.469479 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:10:46.712338 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:11:07.195708 1 simple_featuregate_reader.go:290] "Unhandled Error" err="cluster failed with : unable to determine features: missing desired version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" logger="UnhandledError"
E0210 14:11:26.215863 1 starter.go:91] timed out waiting for FeatureGate detection
W0210 14:11:26.216539 1 builder.go:136] graceful termination failed, controllers failed with error: timed out waiting for FeatureGate detection
[prestage-dal10-carrier100] rtheis@prestage-mon01-carrier1-worker-1021:~$
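For triage across several control plane pods, the failure signature in the log above can be matched mechanically. The following is a rough sketch, assuming the log text has already been captured (for example from `kubectl logs ... -p`). The function name `missing_feature_gate_versions` is hypothetical, and the pattern is derived only from the log lines shown in this report.

```python
import re

# Matches the error shown above; the quotes around the version may or may
# not be backslash-escaped depending on how the log line was rendered.
MISSING_VERSION_RE = re.compile(
    r'missing desired version \\?"([^"\\]+)\\?" '
    r'in featuregates\.config\.openshift\.io/cluster'
)


def missing_feature_gate_versions(log_text):
    """Return the set of versions reported missing from the 'cluster' featuregate."""
    return set(MISSING_VERSION_RE.findall(log_text))


sample = (
    r'E0210 14:10:26.537119 1 simple_featuregate_reader.go:290] "Unhandled Error" '
    r'err="cluster failed with : unable to determine features: missing desired '
    r'version \"4.18.0-rc.8\" in featuregates.config.openshift.io/cluster" '
    r'logger="UnhandledError"'
)
print(missing_feature_gate_versions(sample))  # {'4.18.0-rc.8'}
```

A non-empty result for a pod that is in CrashLoopBackOff suggests it is affected by the same blocked featuregate update.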