OpenShift Bugs / OCPBUGS-14107

NMstate: Failed to add one more slave interface to a bond interface


    • Quality / Stability / Reliability
    • False
    • Needs a backport of the NetworkManager (NM) fix to 1.42, because that is the version used in OCP.
    • None
    • None
    • No
    • 5/9: Needs to be retested, as OPNET-282 is coming as Tech Preview in 4.16. Color: green
      08/18: The issue is resolved; however, no build with the fix is available yet. Ben to verify whether this needs to be backported.
    • None
    • None
    • Rejected
    • CNF Network Sprint 239, CNF Network Sprint 240
    • 2

      Description of problem:

      Initial setup:
      OCP Baremetal cluster
      VFs ==> VLAN interfaces ==> bond, with ovs-system as the bond's master
      
      
      2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 40:a6:b7:71:b2:a0 brd ff:ff:ff:ff:ff:ff
          vf 0     link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
          vf 1     link/ether 2a:cd:0d:f8:2e:69 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
          vf 2     link/ether de:5e:30:68:e6:b9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
          altname enp59s0f0
      5: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 40:a6:b7:71:b2:a1 brd ff:ff:ff:ff:ff:ff
          vf 0     link/ether 36:d7:a1:46:9c:ab brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
          vf 1     link/ether b2:a3:a8:62:d7:bb brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
          vf 2     link/ether 96:45:7c:9b:97:93 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
          altname enp59s0f1
      10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
          link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff
      11: ens1f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff
          altname enp59s0f0v0
      14: ens1f0v0.477@ens1f0v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1400 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
          link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff
      16: ens1f1v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 36:d7:a1:46:9c:ab brd ff:ff:ff:ff:ff:ff
          altname enp59s0f1v0
      17: ens1f1v0.477@ens1f1v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1400 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
          link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff
      21: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff
      
      
      # oc get nns worker-0 -o=jsonpath='{.status.currentState.interfaces[?(@.name=="bond0")].link-aggregation.port}'
      ["ens1f0v0.477","ens1f1v0.477"]
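
The layering in the `ip link` output above (VLAN on a VF, enslaved to bond0, bond0 attached to ovs-system) can also be checked mechanically. The following is a minimal sketch, not part of the original report: the helper names `parse_links` and `ports_of` are hypothetical, and the parser only assumes the multi-line `ip link` format shown above.

```python
import re

# Header line of an `ip link` entry: "<index>: <name>[@<parent>]: <FLAGS> ..."
LINK_RE = re.compile(r"^\s*(\d+):\s+([^:@\s]+)(?:@([^:\s]+))?:\s+<[^>]*>(.*)$")
# The lowercase "master <dev>" attribute that follows the flags.
MASTER_RE = re.compile(r"\bmaster\s+(\S+)")


def parse_links(ip_link_output):
    """Map interface name -> {"parent": base device or None, "master": master or None}."""
    links = {}
    for line in ip_link_output.splitlines():
        m = LINK_RE.match(line)
        if not m:
            continue  # skip link/ether, vf and altname continuation lines
        _, name, parent, rest = m.groups()
        master = MASTER_RE.search(rest)
        links[name] = {
            "parent": parent,
            "master": master.group(1) if master else None,
        }
    return links


def ports_of(links, master):
    """Return the sorted names of all interfaces enslaved to `master`."""
    return sorted(n for n, info in links.items() if info["master"] == master)


# Excerpt copied from the initial setup above.
SAMPLE = """\
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 9e:17:41:d9:da:b7 brd ff:ff:ff:ff:ff:ff
14: ens1f0v0.477@ens1f0v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1400 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
17: ens1f1v0.477@ens1f1v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1400 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
"""

if __name__ == "__main__":
    links = parse_links(SAMPLE)
    print(ports_of(links, "bond0"))
```

Running the same helper on the `ip link` output captured after step 3 would show directly whether the re-created VLAN was enslaved to bond0.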

      Version-Release number of selected component (if applicable):

      OCP 4.13.0
      openshift-nmstate 4.13.0-202305172315

      How reproducible:

      100%

      Steps to Reproduce:

      1. Remove one of the VLAN interfaces that is a slave of the bond interface
      
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: remove-vlan 
      spec:
        nodeSelector:
          kubernetes.io/hostname: worker-1
        maxUnavailable: 3 
        desiredState:
          interfaces:
            - name: ens1f0v0.477
              type: vlan
              state: absent
              vlan:
                base-iface: ens1f0v0
                id: 477
      
      
      2. Re-create the same VLAN interface, but with a different base-iface
      
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: base-vlan-create-vlan 
      spec:
        nodeSelector:
          kubernetes.io/hostname: worker-1
        maxUnavailable: 3 
        desiredState:
          interfaces:
            - name: ens1f0v0.477
              type: vlan
              state: up
              vlan:
                base-iface: ens1f0v1
                id: 477
      
      
      3. Try to add the newly created VLAN interface to the bond interface
      
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: base-vlan-bond
      spec:
        nodeSelector:
          kubernetes.io/hostname: worker-1
        maxUnavailable: 3 
        desiredState:
          interfaces:
            - name: bond0
              type: bond
              state: up
              link-aggregation:
                mode: active-backup
                options:
                  primary: ens1f0v0.477
                port:
                  - ens1f1v0.477
                  - ens1f0v0.477
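
For repeated reproduction, the two VLAN policies from steps 1 and 2 can be generated rather than hand-edited, since they differ only in the policy name, `state` and `base-iface`. This is a sketch under assumptions: the helper `vlan_policy` is hypothetical, and the manifests are emitted as JSON, which `oc apply -f` accepts in place of YAML.

```python
import json


def vlan_policy(policy_name, node, vlan_name, base_iface, vlan_id, state):
    """Build an NNCP manifest (as a dict) for a single VLAN interface."""
    return {
        "apiVersion": "nmstate.io/v1",
        "kind": "NodeNetworkConfigurationPolicy",
        "metadata": {"name": policy_name},
        "spec": {
            "nodeSelector": {"kubernetes.io/hostname": node},
            "maxUnavailable": 3,
            "desiredState": {
                "interfaces": [
                    {
                        "name": vlan_name,
                        "type": "vlan",
                        "state": state,
                        "vlan": {"base-iface": base_iface, "id": vlan_id},
                    }
                ]
            },
        },
    }


# Step 1: remove the VLAN that sits on ens1f0v0.
step1 = vlan_policy("remove-vlan", "worker-1", "ens1f0v0.477", "ens1f0v0", 477, "absent")
# Step 2: re-create the same VLAN name on a different base interface.
step2 = vlan_policy("base-vlan-create-vlan", "worker-1", "ens1f0v0.477", "ens1f0v1", 477, "up")

if __name__ == "__main__":
    for policy in (step1, step2):
        print(json.dumps(policy, indent=2))
```

Note that the VLAN keeps the name ens1f0v0.477 in step 2 even though its base interface is now ens1f0v1, exactly as in the policies above.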

      Actual results:

      1) It looks like the entire bond configuration is removed and added again. I lost workers for several minutes
      
      worker-1   NotReady   worker                 3h4m    v1.26.3+b404935
      
      2) The bond interface has only one slave interface
      
      oc get nns worker-1 -o=jsonpath='{.status.currentState.interfaces[?(@.name=="bond0")].link-aggregation.port}'
      ["ens1f1v0.477"]
      
      

      Expected results:

      The bond has 2 enslaved interfaces: ens1f1v0.477 and ens1f0v0.477
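
The gap between actual and expected results can be expressed as a small check on the jsonpath output of `oc get nns`. This is a minimal sketch; `missing_ports` is a hypothetical helper, and it assumes the command prints the port list as a JSON array, as in the outputs quoted in this report.

```python
import json


def missing_ports(desired_ports, jsonpath_output):
    """Return the desired bond ports that are absent from the reported port list."""
    current = set(json.loads(jsonpath_output))
    return [port for port in desired_ports if port not in current]


# Ports requested in the step 3 policy, and the port list reported afterwards.
desired = ["ens1f1v0.477", "ens1f0v0.477"]
reported = '["ens1f1v0.477"]'

if __name__ == "__main__":
    print(missing_ports(desired, reported))  # ['ens1f0v0.477']
```

An empty list would mean that the bond carries every port requested by the policy.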

      Additional info:

      failed nnce - http://pastebin.test.redhat.com/1101083
      journalctl - http://pastebin.test.redhat.com/1101085
      nmstate-handler - http://pastebin.test.redhat.com/1101086

              bnemec@redhat.com Benjamin Nemec
              rhn-cnf-elevin Evgeny Levin
              Evgeny Levin
              Carlos Goncalves