-
Bug
-
Resolution: Won't Do
-
Undefined
-
None
-
4.17
-
Important
-
None
-
Rejected
-
False
-
-
-
Known Issue
-
Done
-
Description of problem:
Reverting the bond mode failed on one worker (worker-1), while the other worker applied the configuration successfully.
Version-Release number of selected component (if applicable):
kubernetes-nmstate-operator.4.17.0-202409161407
How reproducible:
100%
Steps to Reproduce:
1. Change bond mode to balance-rr
2. Run traffic/failover tests
3. Revert bond mode back to active-backup
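For reference, step 1 could be expressed with a policy along the lines of the sketch below. This is an assumption, not the reporter's manifest: the balance-rr policy is not included in this report, the policy name is hypothetical, and the port names and node selector are copied from the outputs in the Actual results and Additional info sections.

```yaml
# Hypothetical policy for step 1. The manifest actually used is not part of this report.
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: changebondmode-rr        # assumed name, for illustration only
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""   # selector as shown under Additional info
  desiredState:
    interfaces:
    - name: bond0
      type: bond
      state: up
      link-aggregation:
        mode: balance-rr         # step 3 sets this back to active-backup
        port:                    # port names as they appear in the describe output
        - ens1f0v0
        - ens1f1v0
```

Step 3 would then apply the same shape of policy with `mode: active-backup`, which is the policy shown failing on worker-1 under Actual results.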
Actual results:
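With the port configuration in the policy, the revert fails on worker-1 (output below). It works fine without the port configuration (see Additional info):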
$ oc describe nncp -A
Name:         changebondmode
Namespace:
Labels:       <none>
Annotations:  nmstate.io/webhook-mutating-timestamp: 1726368718988958883
API Version:  nmstate.io/v1
Kind:         NodeNetworkConfigurationPolicy
Metadata:
  Creation Timestamp:  2024-09-15T02:51:58Z
  Generation:          1
  Resource Version:    776290
  UID:                 47fcf71b-9b75-416b-8836-dd55926feda4
Spec:
  Desired State:
    Interfaces:
      Link - Aggregation:
        Mode:  active-backup
        Options:
          fail_over_mac:  none
        Port:
          ens1f0v0
          ens1f1v0
      Name:   bond0
      State:  up
      Type:   bond
  Node Selector:
    node-role.kubernetes.io/workercnf:
Status:
  Conditions:
    Last Heartbeat Time:   2024-09-15T02:54:14Z
    Last Transition Time:  2024-09-15T02:54:14Z
    Reason:                FailedToConfigure
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2024-09-15T02:54:14Z
    Last Transition Time:  2024-09-15T02:54:14Z
    Message:               1/2 nodes failed to configure
    Reason:                FailedToConfigure
    Status:                True
    Type:                  Degraded
    Last Heartbeat Time:   2024-09-15T02:54:14Z
    Last Transition Time:  2024-09-15T02:54:14Z
    Reason:                ConfigurationProgressing
    Status:                False
    Type:                  Progressing
  Last Unavailable Node Count Update:  2024-09-15T02:54:14Z
Events:
  Type     Reason           Age    From                      Message
  ----     ------           ----   ----                      -------
  Warning  ReconcileFailed  8m19s  worker-1.nmstate-handler  error reconciling NodeNetworkConfigurationPolicy on node worker-1 at desired state apply: "", rolling back desired state configuration: failed runnig probes after network changes: failed runnig probe 'ping' with after network reconfiguration -> currentState:
hostname: .......
external_ids:
  hostname: worker-1
  ovn-enable-lflow-cache: 'true'
  ovn-encap-ip: 10.46.77.3
  ovn-encap-type: geneve
  ovn-is-interconn: 'true'
  ovn-memlimit-lflow-cache-kb: '1048576'
  ovn-monitor-all: 'true'
  ovn-ofctrl-wait-before-clear: '0'
  ovn-openflow-probe-interval: '180'
  ovn-remote: unix:/var/run/ovn/ovnsb_db.sock
  ovn-remote-probe-interval: '180000'
  ovn-set-local-ip: 'true'
  rundir: /var/run/openvswitch
  system-id: 4f58abc6-2246-4606-82d8-3d574d99cf55
other_config:
  bundle-idle-timeout: '180'
  ovn-chassis-idx-4f58abc6-2246-4606-82d8-3d574d99cf55: ''
  vlan-limit: '0'
ovn:
  bridge-mappings:
  - localnet: physnet
    bridge: br-ex
: context deadline exceeded
Expected results:
Additional info:
$ oc get nncp -A -oyaml
apiVersion: v1
items:
- apiVersion: nmstate.io/v1
  kind: NodeNetworkConfigurationPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"nmstate.io/v1","kind":"NodeNetworkConfigurationPolicy","metadata":{"annotations":{},"name":"mode"},"spec":{"desiredState":{"interfaces":[{"description":"Change mode","link-aggregation":{"mode":"active-backup"},"name":"bond0","state":"up","type":"bond"}]},"maxUnavailable":3,"nodeSelector":{"node-role.kubernetes.io/worker":""}}}
      nmstate.io/webhook-mutating-timestamp: "1726509854030132671"
    creationTimestamp: "2024-09-16T18:04:14Z"
    generation: 1
    name: mode
    resourceVersion: "79454"
    uid: 7acb5e9e-f374-4b45-a70f-c5fba2dfc4bd
  spec:
    desiredState:
      interfaces:
      - description: Change mode
        link-aggregation:
          mode: active-backup
        name: bond0
        state: up
        type: bond
    maxUnavailable: 3
    nodeSelector:
      node-role.kubernetes.io/worker: ""
  status:
    conditions:
    - lastHeartbeatTime: "2024-09-16T18:04:18Z"
      lastTransitionTime: "2024-09-16T18:04:18Z"
      message: 2/2 nodes successfully configured
      reason: SuccessfullyConfigured
      status: "True"
      type: Available
    - lastHeartbeatTime: "2024-09-16T18:04:18Z"
      lastTransitionTime: "2024-09-16T18:04:18Z"
      reason: SuccessfullyConfigured
      status: "False"
      type: Degraded
    - lastHeartbeatTime: "2024-09-16T18:04:18Z"
      lastTransitionTime: "2024-09-16T18:04:18Z"
      reason: ConfigurationProgressing
      status: "False"
      type: Progressing
    lastUnavailableNodeCountUpdate: "2024-09-16T18:04:18Z"
kind: List
metadata:
  resourceVersion: ""