Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 4.16.z
Component/s: Networking / kubernetes-nmstate-operator
Labels:
- nmstate

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Moderate
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

RH Private Keywords:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

    The NNCP remains in Progressing/ConfigurationProgressing state even though the NNCE for the targeted node is Available/SuccessfullyConfigured (the policy uses a single-node selector). As a result, all CI/CD pipelines fail because the NNCP never becomes Available.

NNCP:
$ oc get nncp | grep -i 3003-worker-3
vlan-infra-10g-3003-worker-3-policy     Progressing   ConfigurationProgressing

NNCE:
$ oc get nnce | grep -i 3003-worker-3
worker-3.vlan-infra-10g-3003-worker-3-policy     Available   2025-01-28T13:19:48Z   SuccessfullyConfigured

nmstate-handler pods are showing the following errors:
---
2025-01-28T13:19:48.615967539Z {"level":"info","ts":"2025-01-28T13:19:48.615Z","logger":"controllers.NodeNetworkConfigurationPolicy.forceNNSRefresh","msg":"forcing NodeNetworkState refresh after NNCP applied","node":"worker-3"}

2025-01-28T13:19:48.929845273Z {"level":"error","ts":"2025-01-28T13:19:48.929Z","logger":"controllers.NodeNetworkConfigurationPolicy","msg":"error decrementing unavailableNodeCount with non-cached client","error":"Internal error occurred: failed calling webhook \"nodenetworkconfigurationpolicies-status-mutate.nmstate.io\": failed to call webhook: Post \"https://nmstate-webhook.openshift-nmstate.svc:443/nodenetworkconfigurationpolicies-status-mutate?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"nmstate\")","stacktrace":"github.com/nmstate/kubernetes-nmstate/controllers/handler.
...}
2025-01-28T13:19:48.979492103Z {"level":"info","ts":"2025-01-28T13:19:48.979Z","logger":"policyconditions","msg":"numberOfNmstateMatchingNodes: 1, enactments count: {failed: {true: 0, false: 1, unknown: 0}, progressing: {true: 0, false: 1, unknown: 0}, pending: {true: 0, false: 1, unknown: 0}, available: {true: 1, false: 0, unknown: 0}, aborted: {true: 0, false: 1, unknown: 0}}","policy":"vlan-infra-10g-3003-worker-3-policy"}
2025-01-28T13:19:48.979492103Z {"level":"info","ts":"2025-01-28T13:19:48.979Z","logger":"policyconditions","msg":"SetPolicySuccess"}
---
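To spot this failure quickly in a handler log, a small filter along the following lines may help. It is only a sketch: the function name `webhook_cert_errors` is hypothetical, and it relies on nothing more than the two substrings visible in the error above ("failed calling webhook" and "x509:").

```shell
# webhook_cert_errors: hypothetical helper, not part of nmstate or oc.
# Reads nmstate-handler log lines on stdin and prints only the lines where
# a webhook call failed certificate verification.
webhook_cert_errors() {
  # Both substrings are taken from the error message quoted in this report.
  grep -F 'failed calling webhook' | grep -F 'x509:'
}
```

Assuming the handler pod name is known, it could be used as `oc logs -n openshift-nmstate <nmstate-handler-pod> | webhook_cert_errors`; the function exits non-zero when no matching line is found.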

The partner is aware of solution 7101331 (deleting the nmstate pods), but it is not a viable workaround for CI/CD.

Is there anything that can be done to make the NNCP succeed?

Version-Release number of selected component (if applicable):

    4.16.25

How reproducible:

    True

Steps to Reproduce:

    1. Apply an NNCP.
    2. Check that the NNCE is Available with reason SuccessfullyConfigured.
    3. Check the NNCP again; it is still Progressing with reason ConfigurationProgressing.
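Steps 2 and 3 can be scripted for a CI job roughly as follows. This is a sketch under assumptions: the function names `available_status` and `check_policy` and the `OC` override variable are hypothetical, and the enactment name is assumed to follow the `<node>.<policy>` pattern seen in the NNCE output above.

```shell
# OC can be overridden (for example with a stub) for dry runs; defaults to oc.
OC="${OC:-oc}"

# available_status KIND NAME: hypothetical helper that prints the status
# ("True"/"False"/"Unknown") of the Available condition of the resource.
available_status() {
  "$OC" get "$1" "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# check_policy POLICY NODE: hypothetical helper that compares the enactment
# and the policy. Returns 0 when the mismatch described in this report is
# present (NNCE Available, NNCP not Available).
check_policy() {
  policy="$1"
  node="$2"
  nnce="$(available_status nnce "${node}.${policy}")"
  nncp="$(available_status nncp "${policy}")"
  echo "nnce=${nnce} nncp=${nncp}"
  [ "$nnce" = "True" ] && [ "$nncp" != "True" ]
}
```

For the policy in this report, the call would be `check_policy vlan-infra-10g-3003-worker-3-policy worker-3`.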

Actual results:

    The NNCP remains stuck in Progressing status.

Expected results:

    NNCP policy with status Available

Additional info:

Assignee: Mat Kowalski

Reporter: Ivan Garcia

QA Contact: Ross Brattain

Need Info From: None

Votes: 0

Watchers: 5

Created: 2025/01/31 4:03 PM

Updated: 2025/10/09 8:44 PM
