OpenShift Bugs
OCPBUGS-77842

While upgrading the HostedCluster, the NodePool is recreated twice: once during the HC upgrade and again during the NodePool upgrade

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • Affects Version/s: 4.17, 4.18
    • Component/s: HyperShift

      Description of problem:

       While upgrading the HostedCluster from 4.17.29 to 4.18.30/33, we observed that the NodePool is recreated during the HC upgrade itself, even before the actual NodePool upgrade begins. The NodePool reports a configuration change as the reason. As a result, customers see two NodePool recycles during every upgrade.

      Version-Release number of selected component (if applicable):

       MCE 2.9

      How reproducible:

       Unable to reproduce outside the customer's clusters

      Find the details below:

      1-  The customer is patching both the HostedCluster (HC) and the NodePool simultaneously, and the NodePool is waiting for the HostedCluster upgrade to complete. Please see the status below:

       - lastTransitionTime: "2026-03-04T08:28:04Z"
          message: 'Failed to get release image: the latest version supported is: "4.17.29".
            Attempting to use: "4.18.33"' 

      2- While the HostedCluster upgrade is in progress, specifically at the stage where all ClusterOperators have been upgraded to the desired version and the upgrade is waiting on the network ClusterOperator, the NodePool begins undergoing a configuration change.

      % oc get nodepool -n clusters xxx
      NAME              CLUSTER           DESIRED NODES   CURRENT NODES   AUTOSCALING   AUTOREPAIR   VERSION   UPDATINGVERSION   UPDATINGCONFIG   MESSAGE
      xxx   xxx   2               2               False         True         4.17.29   False             True
      
      $ oc get co
      NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      console                                    4.18.33   True        False         False      25d
      csi-snapshot-controller                    4.18.33   True        False         False      14m
      dns                                        4.18.33   True        False         False      12m
      image-registry                             4.18.33   True        False         False      14m
      ingress                                    4.18.33   True        False         False      13m
      insights                                   4.18.33   True        False         False      145d
      kube-apiserver                             4.18.33   True        False         False      160d
      kube-controller-manager                    4.18.33   True        False         False      160d
      kube-scheduler                             4.18.33   True        False         False      160d
      kube-storage-version-migrator              4.18.33   True        False         False      11m
      monitoring                                 4.18.33   True        False         False      160d
      network                                    4.17.29   True        True          False      160d    DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" update is rolling out (2 out of 3 updated)...
      node-tuning                                4.18.33   True        True          False      5m22s   Waiting for 1/3 Profiles to be applied
      openshift-apiserver                        4.18.33   True        False         False      160d
      openshift-controller-manager               4.18.33   True        False         False      160d
      openshift-samples                          4.18.33   True        False         False      160d
      operator-lifecycle-manager                 4.18.33   True        False         False      160d
      operator-lifecycle-manager-catalog         4.18.33   True        False         False      160d
      operator-lifecycle-manager-packageserver   4.18.33   True        False         False      160d
      service-ca                                 4.18.33   True        False         False      160d
      storage                                    4.18.33   True        False         False      160d
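The state in the `oc get co` output above can be summarized with a small illustrative check (not CVO code): the HC upgrade is not complete while any ClusterOperator still reports the old version, and here only `network` is behind.

```python
# Illustrative check over the ClusterOperator table above (abbreviated).
# Not actual cluster-version-operator logic.
DESIRED = "4.18.33"

cluster_operators = {
    "console": "4.18.33",
    "dns": "4.18.33",
    "kube-apiserver": "4.18.33",
    "network": "4.17.29",      # still rolling out ovnkube-node
    "node-tuning": "4.18.33",  # on the new version, but still Progressing
}

pending = sorted(n for n, v in cluster_operators.items() if v != DESIRED)
print(pending)  # -> ['network']
```

So the NodePool's config change in step 2 starts while the control-plane upgrade is still in this not-yet-complete state.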
      
       

      3- At the same time, the NodePool currentConfig points to a secret that was created before the upgrade. The NodePool initially reports that the new secret cannot be found (which appears to be a transient error). Later, the new secret is created, and the NodePool then proceeds with the config change.

      # oc get nodepool <name> -o yaml
      
      hypershift.openshift.io/nodePoolCurrentConfig: 72e6e8e8
      hypershift.openshift.io/nodePoolCurrentConfigVersion: 1d346cfe
      
      Below are the secrets corresponding to the current config version:
      
      token-xxx-1d346cfe                        Opaque                           9      96d
      user-data-xxx-1d346cfe                    Opaque                           2      96d
      user-data-xxx-1d346cfe-userdata           cluster.x-k8s.io/secret          1      33d
      
      The NodePool starts reporting the conditions below once UpdatingConfig becomes "True" during the HC upgrade:
      
       - lastTransitionTime: "2026-03-04T08:28:04Z"
          message: Secret "token-xxx-69b84919" not found
          observedGeneration: 5
          reason: NotFound
          status: "False"
          type: ReachedIgnitionEndpoint
        - lastTransitionTime: "2026-03-04T08:28:04Z"
          message: 'Updating config in progress. Target config: 1c4e6609'
          observedGeneration: 5
          reason: AsExpected
          status: "True"
          type: UpdatingConfig
        - lastTransitionTime: "2025-09-24T11:32:35Z"
          observedGeneration: 5
          reason: AsExpected
          status: "False"
          type: UpdatingVersion
        - lastTransitionTime: "2026-03-04T08:28:04Z"
          message: Secret "token-xxx-69b84919" not found
          observedGeneration: 5
          reason: NotFound
          status: "False"
          type: ValidGeneratedPayload
      
      Later, this secret is recognized and the config update starts:
      
      token-xxx-69b84919                        Opaque                           8      18m
      user-data-xxx-69b84919                    Opaque                           2      18m
      user-data-xxx-69b84919-userdata           cluster.x-k8s.io/secret          1      18m
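The transient NotFound in step 3 is consistent with hash-suffixed secret naming: the controller derives the target `token-<nodepool>-<hash>` name from the new config and looks it up before the secret exists. A minimal sketch, assuming a content-hash naming scheme; the hash function, inputs, and helper names here are assumptions for illustration, not HyperShift's actual implementation:

```python
import hashlib

def config_hash(config: str) -> str:
    """Short content hash used as the secret-name suffix (assumed scheme)."""
    return hashlib.sha256(config.encode()).hexdigest()[:8]

def lookup_token_secret(secrets: set[str], nodepool: str, config: str):
    """Return (name, error): the target secret name, or why it was not found."""
    name = f"token-{nodepool}-{config_hash(config)}"
    if name not in secrets:
        # Surfaces on the NodePool as reason: NotFound until the
        # token-secret controller creates the new secret.
        return None, f'Secret "{name}" not found'
    return name, None

secrets: set[str] = set()
new_config = "release=4.18.33;ignition-config=v2"  # hypothetical config inputs

# 1) The controller computes the target name before the secret exists: NotFound.
_, err = lookup_token_secret(secrets, "np", new_config)
print(err)

# 2) The token controller catches up and creates the secret; lookup succeeds.
secrets.add(f"token-np-{config_hash(new_config)}")
name, _ = lookup_token_secret(secrets, "np", new_config)
print(name)
```

This matches the observed sequence: `Secret "token-xxx-69b84919" not found` first, then the secret appears (age 18m above) and the config update proceeds.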

      4- After the above configuration change is completed, the NodePool nodes are recreated again as part of the NodePool upgrade.

      Please help us understand why the NodePool nodes are being recreated during the HostedCluster upgrade, even before the NodePool upgrade begins.

      Actual results:

       NodePool nodes are recreated twice during a HostedCluster upgrade

      Expected results:

        NodePool nodes should be recreated only once during a HostedCluster upgrade

      Additional info:

       

              Assignee: Unassigned
              Reporter: MUHAMMED ASLAM V K (rhn-support-amuhamme)
              QA Contact: Yu Li