Bug
Resolution: Unresolved
4.16.z
Description of problem:
Observed in RHOSO (https://issues.redhat.com/browse/OSPRH-9899). In this use case we create multiple VLAN interfaces with multiple routes attached to them. On top of these VLAN interfaces we create macvlan NetworkAttachmentDefinitions and attach them to pods. During nmstate-operator periodic updates (or a restart of nmstate-handler), some VLAN interfaces get recreated without any change to the "NodeNetworkConfigurationPolicy" CR. As a result, the secondary NICs (NetworkAttachmentDefinitions) are removed from the pods, and the pods must be recreated to get the lost interfaces back. Initial findings indicate that this happens only when multiple IP routes are configured for these VLAN interfaces and table-id is not set explicitly. We are currently working around it by setting the optional table-id field explicitly: https://github.com/openstack-k8s-operators/architecture/pull/460
Version-Release number of selected component (if applicable):
$ oc get csv -n openshift-nmstate
kubernetes-nmstate-operator.4.16.0-202411251535   Kubernetes NMState Operator   4.16.0-202411251535   kubernetes-nmstate-operator.4.16.0-202411190033   Succeeded

$ oc version
Client Version: 4.17.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: 4.16.0
Kubernetes Version: v1.29.5+29c95f3
How reproducible:
It reproduces on random interfaces with the CR below. For me it was quite consistent on one interface or another; adding more routes or VLAN interfaces should make it more reproducible.

$ cat reproduce.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: test-vlan
spec:
  desiredState:
    interfaces:
      - description: vlan interface 11
        ipv4:
          address:
            - ip: 172.11.0.5
              prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp5s0.11
        state: up
        type: vlan
        vlan:
          base-iface: enp5s0
          id: 21
      - description: vlan interface 12
        ipv4:
          address:
            - ip: 172.12.0.5
              prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp5s0.12
        state: up
        type: vlan
        vlan:
          base-iface: enp5s0
          id: 25
    routes:
      config:
        - destination: 172.11.10.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp5s0.11
        - destination: 172.11.20.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp5s0.11
        - destination: 172.11.30.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp5s0.11
        - destination: 172.11.40.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp5s0.11
        - destination: 172.12.10.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp5s0.12
        - destination: 172.12.20.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp5s0.12
        - destination: 172.12.30.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp5s0.12
        - destination: 172.12.40.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp5s0.12
Steps to Reproduce:
1. oc apply -f reproduce.yaml
2. Check the interface IDs with ip a | grep enp5s0
3. Delete the nmstate handler pod with oc delete -n openshift-nmstate $(oc get pod -n openshift-nmstate -l component=kubernetes-nmstate-handler --no-headers -o name)
4. Wait for the NNCP to be reapplied: oc get nncp test-vlan -w
5. Recheck the interface IDs with ip a | grep enp5s0

When the issue reproduces, the interface ID changes, as shown below:

[zuul@controller-0 ~]$ oc apply -f reproduce.yaml
nodenetworkconfigurationpolicy.nmstate.io/test-vlan created

[zuul@controller-0 ~]$ oc get nncp test-vlan -w
NAME        STATUS        REASON
test-vlan   Progressing   ConfigurationProgressing
test-vlan   Progressing   ConfigurationProgressing
test-vlan   Available     SuccessfullyConfigured

[zuul@controller-0 ~]$ ssh crc-0 ip a|grep enp5s0
Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
9940: enp5s0.11@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 172.11.0.5/24 brd 172.11.0.255 scope global noprefixroute enp5s0.11
9941: enp5s0.12@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 172.12.0.5/24 brd 172.12.0.255 scope global noprefixroute enp5s0.12

[zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.11
Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
172.11.10.0/24 172.11.0.1 0, 172.11.20.0/24 172.11.0.1 0, 172.11.30.0/24 172.11.0.1 0, 172.11.40.0/24 172.11.0.1 0

[zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.12
Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
172.12.10.0/24 172.12.0.1 0, 172.12.20.0/24 172.12.0.1 0, 172.12.30.0/24 172.12.0.1 0, 172.12.40.0/24 172.12.0.1 0

[zuul@controller-0 ~]$ oc delete -n openshift-nmstate $(oc get pod -n openshift-nmstate -l component=kubernetes-nmstate-handler --no-headers -o name)
pod "nmstate-handler-f7qlf" deleted

[zuul@controller-0 ~]$ oc get nncp test-vlan -w
NAME        STATUS        REASON
test-vlan   Available     SuccessfullyConfigured
test-vlan
test-vlan
test-vlan   Progressing   ConfigurationProgressing
test-vlan   Progressing   ConfigurationProgressing
test-vlan   Available     SuccessfullyConfigured

[zuul@controller-0 ~]$ ssh crc-0 ip a|grep enp5s0
Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
9940: enp5s0.11@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 172.11.0.5/24 brd 172.11.0.255 scope global noprefixroute enp5s0.11
9944: enp5s0.12@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 172.12.0.5/24 brd 172.12.0.255 scope global noprefixroute enp5s0.12
# enp5s0.12 was recreated: its interface ID changed from 9941 to 9944

[zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.11
Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
172.11.10.0/24 172.11.0.1 0 table=254, 172.11.20.0/24 172.11.0.1 0 table=254, 172.11.30.0/24 172.11.0.1 0 table=254, 172.11.40.0/24 172.11.0.1 0 table=254

[zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.12
Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
172.12.10.0/24 172.12.0.1 0, 172.12.20.0/24 172.12.0.1 0 table=254, 172.12.30.0/24 172.12.0.1 0, 172.12.40.0/24 172.12.0.1 0
# enp5s0.11 was not recreated, likely because all 4 of its routes have table=254 set
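
The before/after comparison in steps 2 and 5 can be automated. The following is a minimal sketch, not part of nmstate or the original reproducer: the function names parse_ifindexes and recreated_interfaces are hypothetical, and the sample strings are trimmed copies of the ip a output captured above.

```python
from __future__ import annotations

import re

# Matches the header line of each interface in `ip a` output, for example
# "9940: enp5s0.11@enp5s0: <BROADCAST,...> mtu 1500 ...".
_IFACE_RE = re.compile(r"^(\d+):\s+([^:@\s]+)(?:@[^:\s]+)?:", re.MULTILINE)


def parse_ifindexes(ip_a_output: str) -> dict[str, int]:
    """Map interface name -> interface index from `ip a` text output."""
    return {name: int(idx) for idx, name in _IFACE_RE.findall(ip_a_output)}


def recreated_interfaces(before: str, after: str) -> dict[str, tuple[int, int]]:
    """Return interfaces whose index changed, as name -> (old, new)."""
    old, new = parse_ifindexes(before), parse_ifindexes(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


# Trimmed samples taken from the transcript (before and after the handler restart).
BEFORE = """\
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
9940: enp5s0.11@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    inet 172.11.0.5/24 brd 172.11.0.255 scope global noprefixroute enp5s0.11
9941: enp5s0.12@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    inet 172.12.0.5/24 brd 172.12.0.255 scope global noprefixroute enp5s0.12
"""

AFTER = BEFORE.replace("9941: enp5s0.12", "9944: enp5s0.12")

if __name__ == "__main__":
    print(recreated_interfaces(BEFORE, AFTER))  # {'enp5s0.12': (9941, 9944)}
```

A changed index for an unchanged name is used here as the signal that the kernel device was deleted and created again, which is the same signal the report relies on.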
Actual results:
Interfaces are recreated without any change in the desired state.
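
In the transcript, the interface that was recreated (enp5s0.12) is also the one whose NetworkManager profile ended up with table=254 on only some of its routes. A small check for that condition is sketched below. It assumes the comma-separated format printed by nmcli -g ipv4.routes in this report; the helper names split_routes and has_mixed_tables are hypothetical, and whether a mixed profile is the actual cause of the recreation is only the reporter's hypothesis.

```python
from __future__ import annotations


def split_routes(nmcli_routes: str) -> list[dict]:
    """Parse `nmcli -g ipv4.routes` output into destination/table records."""
    routes = []
    for entry in nmcli_routes.split(","):
        fields = entry.split()
        if not fields:
            continue
        # Optional attributes such as "table=254" follow the positional fields.
        attrs = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
        routes.append({"destination": fields[0], "table": attrs.get("table")})
    return routes


def has_mixed_tables(nmcli_routes: str) -> bool:
    """True when some routes carry a table attribute and others do not,
    or when routes reference different tables."""
    tables = {route["table"] for route in split_routes(nmcli_routes)}
    return len(tables) > 1


# Route lists copied from the transcript, after the handler restart.
ENP5S0_11 = (
    "172.11.10.0/24 172.11.0.1 0 table=254, 172.11.20.0/24 172.11.0.1 0 table=254, "
    "172.11.30.0/24 172.11.0.1 0 table=254, 172.11.40.0/24 172.11.0.1 0 table=254"
)
ENP5S0_12 = (
    "172.12.10.0/24 172.12.0.1 0, 172.12.20.0/24 172.12.0.1 0 table=254, "
    "172.12.30.0/24 172.12.0.1 0, 172.12.40.0/24 172.12.0.1 0"
)

if __name__ == "__main__":
    print("enp5s0.11 mixed:", has_mixed_tables(ENP5S0_11))
    print("enp5s0.12 mixed:", has_mixed_tables(ENP5S0_12))
```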
Expected results:
Interfaces should not be recreated on an nmstate operator update or an nmstate handler restart when desiredState has not changed.
Additional info:
The issue does not reproduce if table-id is set explicitly, i.e.:

$ cat noreproduce.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: test-vlan
spec:
  desiredState:
    interfaces:
      - description: vlan interface 11
        ipv4:
          address:
            - ip: 172.11.0.5
              prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp5s0.11
        state: up
        type: vlan
        vlan:
          base-iface: enp5s0
          id: 21
      - description: vlan interface 12
        ipv4:
          address:
            - ip: 172.12.0.5
              prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp5s0.12
        state: up
        type: vlan
        vlan:
          base-iface: enp5s0
          id: 25
    routes:
      config:
        - destination: 172.11.10.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp2s0.11
          table-id: 254
        - destination: 172.11.20.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp2s0.11
          table-id: 254
        - destination: 172.11.30.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp2s0.11
          table-id: 254
        - destination: 172.11.40.0/24
          next-hop-address: 172.11.0.1
          next-hop-interface: enp2s0.11
          table-id: 254
        - destination: 172.12.10.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp2s0.12
          table-id: 254
        - destination: 172.12.20.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp2s0.12
          table-id: 254
        - destination: 172.12.30.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp2s0.12
          table-id: 254
        - destination: 172.12.40.0/24
          next-hop-address: 172.12.0.1
          next-hop-interface: enp2s0.12
          table-id: 254
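
For policies with many routes, the workaround can be applied mechanically. The sketch below is illustrative only: it operates on a desiredState that has already been loaded into a Python dict (for example with a YAML parser of your choice), and the helper name pin_table_id is hypothetical. The default of 254 is the value used in noreproduce.yaml, which corresponds to the Linux main routing table.

```python
from __future__ import annotations

import copy

MAIN_TABLE_ID = 254  # value used in noreproduce.yaml (Linux main routing table)


def pin_table_id(desired_state: dict, table_id: int = MAIN_TABLE_ID) -> dict:
    """Return a copy of desired_state in which every configured route has an
    explicit table-id. Routes that already set table-id are left untouched."""
    result = copy.deepcopy(desired_state)
    for route in result.get("routes", {}).get("config", []):
        route.setdefault("table-id", table_id)
    return result


# Shortened example based on the routes in reproduce.yaml.
DESIRED_STATE = {
    "routes": {
        "config": [
            {
                "destination": "172.11.10.0/24",
                "next-hop-address": "172.11.0.1",
                "next-hop-interface": "enp5s0.11",
            },
            {
                "destination": "172.12.10.0/24",
                "next-hop-address": "172.12.0.1",
                "next-hop-interface": "enp5s0.12",
            },
        ]
    }
}

if __name__ == "__main__":
    for route in pin_table_id(DESIRED_STATE)["routes"]["config"]:
        print(route["destination"], route["table-id"])
```

The function returns a copy rather than editing in place, so the original policy stays available for comparison.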
Is triggered by: OSPRH-9899 ovn-controller loses the connection to ovsdbservers after nmstate is automatically upgraded to newer version
Verified