Bug
Resolution: Unresolved
Normal
4.17, 4.18
Description of problem:
While upgrading a HostedCluster from 4.17.29 to 4.18.30/33, we observed that the NodePool nodes are recreated during the HostedCluster (HC) upgrade itself, before the actual NodePool upgrade begins. The NodePool reports a configuration change as the reason. As a result, customers see two NodePool recycles during every upgrade.
Version-Release number of selected component (if applicable):
MCE 2.9
How reproducible:
Unable to reproduce outside of the customer's clusters.
Find the details below:
1- The customer is patching both the HostedCluster (HC) and the NodePool simultaneously, and the NodePool is waiting for the HostedCluster upgrade to complete. Please see the status below:
- lastTransitionTime: "2026-03-04T08:28:04Z"
  message: 'Failed to get release image: the latest version supported is: "4.17.29".
    Attempting to use: "4.18.33"'
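For context, the simultaneous update in step 1 would look roughly like the following. This is a sketch: the resource name "xxx", the "clusters" namespace, and the release pullspec are placeholders, not values taken from the customer cluster.

```shell
# Hypothetical reproduction of the simultaneous patches from step 1.
# "xxx", "clusters", and the pullspec below are placeholders.
PATCH='{"spec":{"release":{"image":"quay.io/openshift-release-dev/ocp-release:4.18.33-x86_64"}}}'

# The actual commands require a live management cluster, so they are
# commented out here; both target the same desired release at once:
# oc patch hostedcluster xxx -n clusters --type merge -p "$PATCH"
# oc patch nodepool xxx -n clusters --type merge -p "$PATCH"

# Sanity-check that the patch body is well-formed JSON:
echo "$PATCH" | python3 -m json.tool >/dev/null && echo "patch is valid JSON"
```

Because the NodePool is patched before the HC rollout finishes, the NodePool initially rejects the release ("the latest version supported is: 4.17.29") and waits, which matches the condition above.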
2- While the HostedCluster upgrade is in progress, specifically at the stage where all ClusterOperators except Network have reached the desired version and the upgrade is waiting on the Network ClusterOperator, the NodePool begins a configuration change.
% oc get nodepool -n clusters xxx
NAME   CLUSTER   DESIRED NODES   CURRENT NODES   AUTOSCALING   AUTOREPAIR   VERSION   UPDATINGVERSION   UPDATINGCONFIG   MESSAGE
xxx    xxx       2               2               False         True         4.17.29   False             True

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console                                    4.18.33   True        False         False      25d
csi-snapshot-controller                    4.18.33   True        False         False      14m
dns                                        4.18.33   True        False         False      12m
image-registry                             4.18.33   True        False         False      14m
ingress                                    4.18.33   True        False         False      13m
insights                                   4.18.33   True        False         False      145d
kube-apiserver                             4.18.33   True        False         False      160d
kube-controller-manager                    4.18.33   True        False         False      160d
kube-scheduler                             4.18.33   True        False         False      160d
kube-storage-version-migrator              4.18.33   True        False         False      11m
monitoring                                 4.18.33   True        False         False      160d
network                                    4.17.29   True        True          False      160d    DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" update is rolling out (2 out of 3 updated)...
node-tuning                                4.18.33   True        True          False      5m22s   Waiting for 1/3 Profiles to be applied
openshift-apiserver                        4.18.33   True        False         False      160d
openshift-controller-manager               4.18.33   True        False         False      160d
openshift-samples                          4.18.33   True        False         False      160d
operator-lifecycle-manager                 4.18.33   True        False         False      160d
operator-lifecycle-manager-catalog         4.18.33   True        False         False      160d
operator-lifecycle-manager-packageserver   4.18.33   True        False         False      160d
service-ca                                 4.18.33   True        False         False      160d
storage                                    4.18.33   True        False         False      160d
3- At the same time, the NodePool currentConfig is pointing to a secret that was created before the upgrade. The NodePool initially reports that a new secret cannot be found (which appears to be a transient error). Later, the new secret is created, and the NodePool then proceeds with the config change.
# oc get nodepool <name> -o yaml
  hypershift.openshift.io/nodePoolCurrentConfig: 72e6e8e8
  hypershift.openshift.io/nodePoolCurrentConfigVersion: 1d346cfe

These are the secrets backing the current config:

token-xxx-1d346cfe                Opaque                    9   96d
user-data-xxx-1d346cfe            Opaque                    2   96d
user-data-xxx-1d346cfe-userdata   cluster.x-k8s.io/secret   1   33d

Once UpdatingConfig became "True" during the HC upgrade, the NodePool started reporting the following conditions:

- lastTransitionTime: "2026-03-04T08:28:04Z"
  message: Secret "token-xxx-69b84919" not found
  observedGeneration: 5
  reason: NotFound
  status: "False"
  type: ReachedIgnitionEndpoint
- lastTransitionTime: "2026-03-04T08:28:04Z"
  message: 'Updating config in progress. Target config: 1c4e6609'
  observedGeneration: 5
  reason: AsExpected
  status: "True"
  type: UpdatingConfig
- lastTransitionTime: "2025-09-24T11:32:35Z"
  observedGeneration: 5
  reason: AsExpected
  status: "False"
  type: UpdatingVersion
- lastTransitionTime: "2026-03-04T08:28:04Z"
  message: Secret "token-xxx-69b84919" not found
  observedGeneration: 5
  reason: NotFound
  status: "False"
  type: ValidGeneratedPayload

Later the new secret was recognized and the config update started:

token-xxx-69b84919                Opaque                    8   18m
user-data-xxx-69b84919            Opaque                    2   18m
user-data-xxx-69b84919-userdata   cluster.x-k8s.io/secret   1   18m
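The "Secret not found" followed by a config update is consistent with content-hash-based secret naming: the token-/user-data- secret names end in a short hash of the rendered config inputs, so any change to those inputs yields a new secret name, and the nodes roll to pick it up. A minimal sketch of that idea follows; the hashing scheme is illustrative only, not HyperShift's exact implementation:

```python
import hashlib

def config_version(rendered_config: str) -> str:
    """Illustrative stand-in for the NodePool config hash: derive a short,
    deterministic suffix from the rendered config inputs, as embedded in
    secret names like token-xxx-69b84919."""
    return hashlib.sha256(rendered_config.encode()).hexdigest()[:8]

old = config_version("release: 4.17.29\nextra: base")
new = config_version("release: 4.17.29\nextra: changed")  # any input change...
assert old != new   # ...produces a new suffix, hence new secrets and a node roll
assert config_version("release: 4.17.29\nextra: base") == old  # deterministic
```

Under this model, if some config input changes mid-HC-upgrade (before the NodePool's own version bump), the controller computes a new hash, the NodePool briefly references a token secret that does not exist yet, and the nodes are recycled once for the config change and again later for the version change.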
4- After the above configuration change is completed, the NodePool nodes are recreated again as part of the NodePool upgrade.
Please help us understand why the NodePool nodes are being recreated during the HostedCluster upgrade, even before the NodePool upgrade begins.
Actual results:
NodePool nodes are recreated twice during a HostedCluster upgrade.
Expected results:
NodePool nodes should be recreated only once during a HostedCluster upgrade.
Additional info: