Type: Bug
Resolution: Won't Do
Priority: Major
Fix Version/s: None
Affects Version/s: 4.16.z
Component/s: Networking / kubernetes-nmstate-operator
Labels:
None

Activity Type:
None
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Important
Regression:
None

Target Backport Versions:
None
Target Version:

4.16.z
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

     NNCE Enters "Failing" State After Node Reboot due to API Server Probe Timeout

Version-Release number of selected component (if applicable):

    4.16.54

How reproducible:

When a node with an active NodeNetworkConfigurationPolicy (NNCP) is rebooted, the nmstate-handler attempts to reconcile the state immediately on startup. The reconciliation fails with a "failed running probe 'api-server'" error, and the NodeNetworkConfigurationEnactment (NNCE) enters the Failing state.

Restarting the nmstate-handler daemonset (oc rollout restart ds/nmstate-handler) resolves the issue immediately.

Steps to Reproduce:

1. Apply a valid NodeNetworkConfigurationPolicy for static routes:

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  labels:
    app.kubernetes.io/instance: openshift-nmstate-config
  name: static-route-hnas-workers
spec:
  desiredState:
    routes:
      config:
      - destination: x.x.x.x/32
        next-hop-address: x.x.x.x
        next-hop-interface: br-ex
        table-id: 254
  nodeSelector:
    node-role.kubernetes.io/worker: ""


2. Verify the NNCE status is Available and SuccessfullyConfigured.

oc get nnce
Thu Jan 15 04:42:47 PM CET 2026
NAME                                                         STATUS      STATUS AGE   REASON
1-workers   Available   2d5h         SuccessfullyConfigured
2-workers   Available   2d5h         SuccessfullyConfigured
3-workers   Available   2d5h         SuccessfullyConfigured
4-workers   Available   2d5h         SuccessfullyConfigured

3. Reboot the worker node (systemctl reboot).

4. Wait for the node to return to Ready state. 

5. Check the NNCE status for that node.

NAME                                                         STATUS      STATUS AGE   REASON
1-workers   Available   2d5h         SuccessfullyConfigured
2-workers   Available   2d5h         SuccessfullyConfigured
3-workers   Available   2d5h         SuccessfullyConfigured
4-workers   Failing     55s          FailedToConfigure

   message: |-
  error reconciling NodeNetworkConfigurationPolicy on node <node-name> desired state apply: "",
  rolling back desired state configuration: failed runnig probes after network changes: failed runnig probe 'api-server' with after network reconfiguration -> currentState: hostname:
  running: node-name
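
To spot the affected node quickly in step 5, the table output can be filtered on the STATUS column. The following is a minimal sketch, not part of the original report; failing_nnce is a hypothetical helper name, and it assumes the default column layout of oc get nnce shown above.

```shell
# Print the names of enactments whose STATUS column reads "Failing".
# Reads `oc get nnce` table output on stdin.
# failing_nnce is a hypothetical helper name, not part of oc or nmstate.
failing_nnce() {
  # Column 1 is NAME, column 2 is STATUS; the header row never matches.
  awk '$2 == "Failing" { print $1 }'
}

# Usage (requires cluster access):
#   oc get nnce | failing_nnce
```

With the step 5 output above, this would print only the name of the fourth enactment.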

Workaround:

Restart the nmstate-handler pods (oc rollout restart ds/nmstate-handler). The NNCE returns to Available immediately afterwards.
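
The handler restart can be wrapped so that it also waits for the new pods to become ready. This is a sketch under assumptions: the openshift-nmstate namespace is assumed (the report does not state it), and restart_nmstate_handler and NMSTATE_NS are hypothetical names introduced here.

```shell
# Restart the nmstate-handler daemonset and wait for the rollout to finish.
# Assumption: the handler runs in the openshift-nmstate namespace;
# set NMSTATE_NS to override. Both names are illustrative only.
restart_nmstate_handler() {
  ns="${NMSTATE_NS:-openshift-nmstate}"
  # First trigger a rolling restart of the handler pods, then block until
  # the restarted pods are ready (or give up after five minutes).
  oc -n "$ns" rollout restart ds/nmstate-handler &&
    oc -n "$ns" rollout status ds/nmstate-handler --timeout=5m
}

# Usage (requires cluster access):
#   restart_nmstate_handler && oc get nnce
```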

Actual results:

 The NNCE goes into Failing state after reboot.

Expected results:

   After a node reboot, the NNCE should automatically reconcile to the Available state.

Additional info:

Assignee: Mat Kowalski

Reporter: Shivam shinde

QA Contact: Ross Brattain

Need Info From: None

Votes: 0

Watchers: 5

Created: 2026/01/19 11:08 AM

Updated: 2026/03/02 3:45 PM

Resolved: 2026/02/23 1:40 PM
