OCPBUGS-45476

With multiple routes (and no table-id set), VLAN interfaces are recreated on nmstate-operator update or nmstate-handler restart

      Description of problem:

      Observed in RHOSO (https://issues.redhat.com/browse/OSPRH-9899), where for one use case we create multiple VLAN interfaces with multiple routes attached. On top of these VLAN interfaces we create macvlan NetworkAttachmentDefinitions and attach them to pods.
      During periodic nmstate-operator updates (or a restart of nmstate-handler), we noticed that some VLAN interfaces get recreated without any change to the "NodeNetworkConfigurationPolicy" CR. As a result, the secondary NICs (NetworkAttachmentDefinitions) are removed from the pods, and the pods must be recreated to get the lost interfaces back.
      From initial findings, this only happens when multiple IP routes are involved for these VLAN interfaces and table-id is not set explicitly.
      We are currently working around it by explicitly setting the optional table-id field: https://github.com/openstack-k8s-operators/architecture/pull/460

       

      Version-Release number of selected component (if applicable):

      $ oc get csv -n openshift-nmstate    
      kubernetes-nmstate-operator.4.16.0-202411251535   Kubernetes NMState Operator    4.16.0-202411251535   kubernetes-nmstate-operator.4.16.0-202411190033   Succeeded
      
      $ oc version
      Client Version: 4.17.3
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: 4.16.0
      Kubernetes Version: v1.29.5+29c95f3

       

      How reproducible:

          Reproduces on random interfaces with the CR below (for me it was quite consistent on one interface or the other; adding more routes/VLAN interfaces makes it more reproducible):
      $ cat reproduce.yaml
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: test-vlan
      spec:
        desiredState:
          interfaces:
          - description: vlan interface 11
            ipv4:
              address:
              - ip: 172.11.0.5
                prefix-length: 24
              dhcp: false
              enabled: true
            ipv6:
              enabled: false
            name: enp5s0.11
            state: up
            type: vlan
            vlan:
              base-iface: enp5s0
              id: 21
          - description: vlan interface 12
            ipv4:
              address:
              - ip: 172.12.0.5
                prefix-length: 24
              dhcp: false
              enabled: true
            ipv6:
              enabled: false
            name: enp5s0.12
            state: up
            type: vlan
            vlan:
              base-iface: enp5s0
              id: 25
          routes:
            config:
            - destination: 172.11.10.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
            - destination: 172.11.20.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
            - destination: 172.11.30.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
            - destination: 172.11.40.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
            - destination: 172.12.10.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
            - destination: 172.12.20.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
            - destination: 172.12.30.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
            - destination: 172.12.40.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12

       

      Steps to Reproduce:

          1. Apply the policy: oc apply -f reproduce.yaml
          2. Check the interface indexes with ip a | grep enp5s0
          3. Delete the nmstate-handler pod with oc delete -n openshift-nmstate $(oc get pod -n openshift-nmstate -l component=kubernetes-nmstate-handler --no-headers -o name)
          4. Wait for the NNCP to be reapplied: oc get nncp test-vlan -w
          5. Recheck the interface indexes with ip a | grep enp5s0
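
The before/after comparison in steps 2 and 5 can be scripted. The following is a minimal sketch, assuming the two `ip a` outputs were saved to files; the function name recreated_ifaces and the file names are hypothetical and not part of nmstate or oc.

```shell
# Sketch: report interfaces whose kernel interface index differs between
# two saved `ip a` snapshots. A changed index means the interface was
# deleted and created again. recreated_ifaces is a hypothetical helper.
recreated_ifaces() {
    awk '
        # Interface header lines look like "9941: enp5s0.12@enp5s0: <...>".
        /^[0-9]+: / {
            idx = $1; sub(/:$/, "", idx)
            name = $2; sub(/:$/, "", name); sub(/@.*/, "", name)
            if (FILENAME == ARGV[1])
                before[name] = idx            # first file: remember index
            else if ((name in before) && before[name] != idx)
                print name ": " before[name] " -> " idx
        }
    ' "$1" "$2"
}

# Usage (file names are assumptions):
#   ssh crc-0 ip a > before.txt
#   ... restart the nmstate-handler pod and wait for the NNCP ...
#   ssh crc-0 ip a > after.txt
#   recreated_ifaces before.txt after.txt
```

An empty result means no interface index changed between the two snapshots.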
      
      When the issue reproduces, the interface index changes, as shown below:
      [zuul@controller-0 ~]$ oc apply -f reproduce.yaml 
      nodenetworkconfigurationpolicy.nmstate.io/test-vlan created
      
      [zuul@controller-0 ~]$ oc get nncp test-vlan -w
      NAME        STATUS        REASON
      test-vlan   Progressing   ConfigurationProgressing
      test-vlan   Progressing   ConfigurationProgressing
      test-vlan   Available     SuccessfullyConfigured
      [zuul@controller-0 ~]$ ssh crc-0 ip a|grep enp5s0
      Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
      2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
      9940: enp5s0.11@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
          inet 172.11.0.5/24 brd 172.11.0.255 scope global noprefixroute enp5s0.11
      9941: enp5s0.12@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
          inet 172.12.0.5/24 brd 172.12.0.255 scope global noprefixroute enp5s0.12
      [zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.11
      Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
      172.11.10.0/24 172.11.0.1 0, 172.11.20.0/24 172.11.0.1 0, 172.11.30.0/24 172.11.0.1 0, 172.11.40.0/24 172.11.0.1 0
      [zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.12
      Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
      172.12.10.0/24 172.12.0.1 0, 172.12.20.0/24 172.12.0.1 0, 172.12.30.0/24 172.12.0.1 0, 172.12.40.0/24 172.12.0.1 0
      
      [zuul@controller-0 ~]$ oc delete -n openshift-nmstate $(oc get pod -n openshift-nmstate -l component=kubernetes-nmstate-handler --no-headers -o name)
      pod "nmstate-handler-f7qlf" deleted
      
      [zuul@controller-0 ~]$ oc get nncp test-vlan -w
      NAME        STATUS      REASON
      test-vlan   Available   SuccessfullyConfigured
      test-vlan               
      test-vlan               
      test-vlan   Progressing   ConfigurationProgressing
      test-vlan   Progressing   ConfigurationProgressing
      test-vlan   Available     SuccessfullyConfigured
      
      [zuul@controller-0 ~]$ ssh crc-0 ip a|grep enp5s0
      Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
      2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
      9940: enp5s0.11@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
          inet 172.11.0.5/24 brd 172.11.0.255 scope global noprefixroute enp5s0.11
      9944: enp5s0.12@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
          inet 172.12.0.5/24 brd 172.12.0.255 scope global noprefixroute enp5s0.12
      
      # enp5s0.12 was recreated: its interface index changed from 9941 to 9944
      
      [zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.11
      Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
      172.11.10.0/24 172.11.0.1 0 table=254, 172.11.20.0/24 172.11.0.1 0 table=254, 172.11.30.0/24 172.11.0.1 0 table=254, 172.11.40.0/24 172.11.0.1 0 table=254
      [zuul@controller-0 ~]$ ssh crc-0 nmcli -g ipv4.routes c show enp5s0.12
      Warning: Permanently added 'crc-0.utility' (ED25519) to the list of known hosts.
      172.12.10.0/24 172.12.0.1 0, 172.12.20.0/24 172.12.0.1 0 table=254, 172.12.30.0/24 172.12.0.1 0, 172.12.40.0/24 172.12.0.1 0
      
      # enp5s0.11 did not change, likely because all 4 of its routes have table=254 set, whereas only 1 of the 4 routes on enp5s0.12 does
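
The nmcli output above hints at a quick check for affected connections: some routes in a profile carry table=N while others do not. The sketch below classifies one line of `nmcli -g ipv4.routes` output. It is only a heuristic based on the observation in this report, not a confirmed root cause, and the function name route_table_mix is hypothetical.

```shell
# Sketch: read one line of `nmcli -g ipv4.routes c show <conn>` output on
# stdin and print "mixed" when some routes carry a table=N attribute and
# others do not, otherwise "consistent". route_table_mix is hypothetical.
route_table_mix() {
    tr ',' '\n' | awk '
        NF == 0  { next }               # skip empty fragments
        /table=/ { tagged++; next }     # route with an explicit table
                 { untagged++ }         # route without a table attribute
        END {
            if (tagged > 0 && untagged > 0) print "mixed"
            else print "consistent"
        }
    '
}

# Usage:
#   ssh crc-0 nmcli -g ipv4.routes c show enp5s0.12 | route_table_mix
```

In the transcript above, enp5s0.12 would be classified as mixed after the handler restart, while enp5s0.11 would be consistent both before and after.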

       

      Actual results:

          Interfaces are recreated without any change to the desired state.

       

       

       

      Expected results:

          Interfaces should not be recreated on an nmstate-operator update or nmstate-handler restart when desiredState has not changed.

      Additional info:

          The issue does not reproduce if table-id is set explicitly, i.e.:
      $ cat noreproduce.yaml
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: test-vlan
      spec:
        desiredState:
          interfaces:
          - description: vlan interface 11
            ipv4:
              address:
              - ip: 172.11.0.5
                prefix-length: 24
              dhcp: false
              enabled: true
            ipv6:
              enabled: false
            name: enp5s0.11
            state: up
            type: vlan
            vlan:
              base-iface: enp5s0
              id: 21
          - description: vlan interface 12
            ipv4:
              address:
              - ip: 172.12.0.5
                prefix-length: 24
              dhcp: false
              enabled: true
            ipv6:
              enabled: false
            name: enp5s0.12
            state: up
            type: vlan
            vlan:
              base-iface: enp5s0
              id: 25
          routes:
            config:
            - destination: 172.11.10.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
              table-id: 254
            - destination: 172.11.20.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
              table-id: 254
            - destination: 172.11.30.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
              table-id: 254
            - destination: 172.11.40.0/24
              next-hop-address: 172.11.0.1
              next-hop-interface: enp5s0.11
              table-id: 254
            - destination: 172.12.10.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
              table-id: 254
            - destination: 172.12.20.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
              table-id: 254
            - destination: 172.12.30.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
              table-id: 254
            - destination: 172.12.40.0/24
              next-hop-address: 172.12.0.1
              next-hop-interface: enp5s0.12
              table-id: 254

       

              bnemec@redhat.com Benjamin Nemec
              ykarel@redhat.com Yatin Karel
              Ross Brattain Ross Brattain