RHWA-387

NodeHealthCheck status.observedNodes does not update dynamically when new nodes match its selector due to a label change


      The NodeHealthCheck (NHC) controller correctly identifies the initial set of nodes matching its spec.selector when the NHC object is first created.

      However, if a node that did not initially match the selector is later updated to match it (e.g., by adding the missing label), the NHC controller fails to detect the change. The node is not counted in .status.observedNodes and is consequently not monitored for health.

      The only way to force the controller to recognize the newly labeled node is to either delete/recreate the NHC object or restart the NHC controller manager pod.

      Steps to Reproduce

      1. Prerequisites: A cluster with the NodeHealthCheck operator running and at least two worker nodes (e.g., node-1 and node-2).
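
      To confirm the prerequisites, check that the operator pods and both worker nodes are up (the namespace is taken from the commands later in this report):

      oc get pods -n openshift-workload-availability
      oc get nodes node-1 node-2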

      Initial Node Labeling:

       Label node-1 with both required labels:

      oc label node node-1 hypershift.openshift.io/nodePool=test
      oc label node node-1 fencing=true

      Label node-2 with only one of the required labels:

      oc label node node-2 hypershift.openshift.io/nodePool=test
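
      As a quick sanity check (not part of the original steps), list the nodes that currently match both labels; at this point only node-1 should appear:

      oc get nodes -l 'hypershift.openshift.io/nodePool=test,fencing=true'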

      2. Create NodeHealthCheck Object:

      Apply a NodeHealthCheck CR whose selector matches on both labels:

        selector:
          matchLabels:
            hypershift.openshift.io/nodePool: test
            fencing: "true" 
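
      The fragment above shows only the selector; a minimal complete manifest along these lines should work. The name, minHealthy, unhealthy condition, and remediation template below are illustrative assumptions, not values from this report, so substitute the remediator configured in your environment:

      oc apply -f - <<'EOF'
      apiVersion: remediation.medik8s.io/v1alpha1
      kind: NodeHealthCheck
      metadata:
        name: nhc-test                       # hypothetical name
      spec:
        selector:
          matchLabels:
            hypershift.openshift.io/nodePool: test
            fencing: "true"
        minHealthy: 51%                      # assumed; matches the documented default
        unhealthyConditions:                 # example condition; values are assumptions
          - type: Ready
            status: "False"
            duration: 300s
        remediationTemplate:                 # assumed Self Node Remediation template; adjust to your remediator
          apiVersion: self-node-remediation.medik8s.io/v1alpha1
          kind: SelfNodeRemediationTemplate
          namespace: openshift-workload-availability
          name: self-node-remediation-automatic-strategy-template
      EOF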

      3. Observe Initial State:

      Check the status of the newly created NHC object:

      oc get nhc <name> -n openshift-workload-availability -o yaml | grep -i observedNodes
        observedNodes: 1
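
      Equivalently, the counter can be read directly with jsonpath:

      oc get nhc <name> -n openshift-workload-availability -o jsonpath='{.status.observedNodes}{"\n"}'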

      4. Trigger the Bug (Update Node 2):

      Add the missing fencing: "true" label to node-2, making it a valid target for the NHC selector:

      oc label node node-2 fencing=true 

      node-2 now fully matches spec.selector.matchLabels.
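
      Re-running the label-selector query from the setup check should now list both nodes:

      oc get nodes -l 'hypershift.openshift.io/nodePool=test,fencing=true'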

      5. Observe Final State:
      Wait several minutes for the controller to reconcile and check the status again:

      oc get nhc <name> -n openshift-workload-availability -o yaml | grep -i observedNodes
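
      To watch for a change over time, a simple polling loop works (replace <name> as above); in the buggy state the value stays at 1:

      while true; do
        oc get nhc <name> -n openshift-workload-availability -o jsonpath='{.status.observedNodes}{"\n"}'
        sleep 30
      done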

      Actual Results

      The status.observedNodes field remains at 1. The controller never detects that node-2 now matches the selector, and the node is not monitored.

      Expected Results

      After node-2 is labeled, the NHC controller's reconciliation loop should detect the change, re-evaluate its selector, and add node-2 to its list of monitored nodes.

      The status.observedNodes field should update to 2.

      Workaround

      Manually forcing the controller to re-initialize its list of nodes "fixes" the issue:

      Workaround 1: Delete and recreate the NHC object (a sketch follows the Workaround 2 command below).
      Workaround 2: Restart the NHC controller manager pod:

      oc delete pods -l app.kubernetes.io/component=controller-manager -n openshift-workload-availability
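
      For Workaround 1, a minimal sketch, assuming the manifest from step 2 was saved as nhc.yaml (the file name is hypothetical):

      oc delete nhc <name> -n openshift-workload-availability
      oc apply -f nhc.yaml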


              Assignee: Unassigned
              Reporter: Vedant Durgam (rhn-support-vdurgam)