-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.16.z
-
Quality / Stability / Reliability
-
False
-
None
-
None
-
Moderate
Description of problem:
The NMState NNCP remains in the Progressing/ConfigurationProgressing state while the NNCE for the same node is Available/SuccessfullyConfigured (single-node selector policy). As a result, all CI/CD pipelines fail, because the NNCP never becomes Available.

NNCP:
$ oc get nncp | grep -i 3003-worker-3
vlan-infra-10g-3003-worker-3-policy Progressing ConfigurationProgressing

NNCE:
$ oc get nnce | grep -i 3003-worker-3
worker-3.vlan-infra-10g-3003-worker-3-policy Available 2025-01-28T13:19:48Z SuccessfullyConfigured

The nmstate-handler pods show the following errors:
---
2025-01-28T13:19:48.615967539Z {"level":"info","ts":"2025-01-28T13:19:48.615Z","logger":"controllers.NodeNetworkConfigurationPolicy.forceNNSRefresh","msg":"forcing NodeNetworkState refresh after NNCP applied","node":"worker-3"}
2025-01-28T13:19:48.929845273Z {"level":"error","ts":"2025-01-28T13:19:48.929Z","logger":"controllers.NodeNetworkConfigurationPolicy","msg":"error decrementing unavailableNodeCount with non-cached client","error":"Internal error occurred: failed calling webhook \"nodenetworkconfigurationpolicies-status-mutate.nmstate.io\": failed to call webhook: Post \"https://nmstate-webhook.openshift-nmstate.svc:443/nodenetworkconfigurationpolicies-status-mutate?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"nmstate\")","stacktrace":"github.com/nmstate/kubernetes-nmstate/controllers/handler. ...}
2025-01-28T13:19:48.979492103Z {"level":"info","ts":"2025-01-28T13:19:48.979Z","logger":"policyconditions","msg":"numberOfNmstateMatchingNodes: 1, enactments count: {failed: {true: 0, false: 1, unknown: 0}, progressing: {true: 0, false: 1, unknown: 0}, pending: {true: 0, false: 1, unknown: 0}, available: {true: 1, false: 0, unknown: 0}, aborted: {true: 0, false: 1, unknown: 0}}","policy":"vlan-infra-10g-3003-worker-3-policy"}
2025-01-28T13:19:48.979492103Z {"level":"info","ts":"2025-01-28T13:19:48.979Z","logger":"policyconditions","msg":"SetPolicySuccess"}
---
The partner knows about solution 7101331 (deleting the nmstate pods), but that is not a viable workaround for CI/CD. Is there anything that can be done to make the NNCP succeed?
Version-Release number of selected component (if applicable):
4.16.25
How reproducible:
True
Steps to Reproduce:
1. Apply the NNCP policy.
2. Check that the NNCE is Available with reason SuccessfullyConfigured.
3. Go back to the NNCP and check whether it is still Progressing with reason ConfigurationProgressing.
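
For CI/CD, the check in steps 2 and 3 could be automated roughly as in the sketch below. It is only an illustration, not part of nmstate: the helper names (`condition_true`, `find_stuck_policies`, `oc_get_items`) are hypothetical. It assumes that both NNCP and NNCE objects expose Kubernetes-style `status.conditions` entries with `type` and `status` fields, and that enactments are named `<node>.<policy>` as in the output above.

```python
import json
import subprocess


def condition_true(resource, cond_type):
    """Return True if the resource has a status condition of this type set to "True"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == cond_type and c.get("status") == "True"
        for c in conditions
    )


def find_stuck_policies(policies, enactments):
    """Return names of policies still Progressing although all their enactments are Available."""
    stuck = []
    for policy in policies:
        name = policy["metadata"]["name"]
        # Assumption: enactments are named "<node>.<policy>", as seen in the report.
        own = [e for e in enactments if e["metadata"]["name"].endswith("." + name)]
        if (
            own
            and all(condition_true(e, "Available") for e in own)
            and condition_true(policy, "Progressing")
            and not condition_true(policy, "Available")
        ):
            stuck.append(name)
    return stuck


def oc_get_items(kind):
    """Fetch all objects of a kind as parsed JSON; needs a logged-in `oc` on PATH."""
    result = subprocess.run(
        ["oc", "get", kind, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["items"]
```

A pipeline step could call `find_stuck_policies(oc_get_items("nncp"), oc_get_items("nnce"))` and report any returned policy names as instances of this bug rather than as a failed network configuration.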
Actual results:
NNCP policy hanging in status Progressing
Expected results:
NNCP policy with status Available
Additional info: