RHEL-56095

Changing the OVN localnet to absent causes leftover OVS bridge

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • NetworkManager
    • rhel-sst-network-management
    • ssg_networking

       

      Customer/Partner: EMIRATES NBD BANK
      Jira ID: RHEL-56095
      Customer case: 03877931
      Status Details:
      The customer is encountering an issue where the OVN localnet interface and its associated OVS bridge were not fully removed after the configuration was set to absent, which caused the NNCP to fail. The root cause is linked to a broader bug (RHEL-50747) where OVS interfaces remain even after their NetworkManager connections are deleted. The next step is for the team to discuss RHEL-50747 and find the best way forward to provide a fix, resolving the issue seen by the customer.
      [2024-09-09] The team has refined RHEL-50747, and work on the fix is planned to start during the next sprint. This will resolve the issue seen by the customer.
      [2024-09-16] RHEL-50747 was not initially added to the sprint, but the team has agreed to work on it as soon as an engineer has capacity during the sprint.
      [2024-09-23] RHEL-50747 is now being worked on in the current sprint to fix the interface creation after its connection has been deleted.
      [2024-09-30] The RHEL-50747 fix is now merged upstream and will be available in the next NetworkManager build. The fix will be backported to RHEL-9.2, targeting the next batch update (2024-11-12).

       


      Given an OVN localnet interface and OVS bridge have been configured and are active on a node,

      When the system administrator changes the NNCP to set the OVN localnet interface and OVS bridge to absent,
      Then, the OVS bridge associated with the localnet interface must be completely removed from the system and the NNCE resource should not report a FailedToConfigure status. 

      Definition of Done:

      • The implementation meets the acceptance criteria
      • Integration tests are written and pass
      • The code is part of a downstream build attached to an errata
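
The acceptance criteria above could be checked with something like the following sketch. The function names `failing_nnces` and `bridge_is_absent` are hypothetical helpers, not part of nmstate or OVS. They only filter the text output of `oc get nnce` and `ovs-vsctl list-br`, assuming the column layout shown in the reproduction steps of this report.

```shell
#!/bin/sh
# Hypothetical helpers for checking the acceptance criteria.

# Print the names of NNCEs whose REASON column is FailedToConfigure.
# Reads `oc get nnce` table output on stdin; the header row is skipped.
failing_nnces() {
    awk 'NR > 1 && $NF == "FailedToConfigure" { print $1 }'
}

# Succeed only if the named bridge is absent from `ovs-vsctl list-br`
# output supplied on stdin (one bridge name per line).
bridge_is_absent() {
    ! grep -Fxq -- "$1"
}
```

Possible usage: run `oc get nnce | failing_nnces` from a workstation (empty output means no NNCE reports FailedToConfigure), and run `ovs-vsctl list-br | bridge_is_absent newbr` on the node itself, where `newbr` is the bridge name from the reproducer.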

      Description of problem:

      The OVN localnet interface configured through an nmstate NNCP does not appear to be removed successfully. This seems to happen more often on nodes running pods whose net-attach-def references the OVN localnet interface. Once the applied configuration is changed to absent, the NNCE resources change status to FailedToConfigure and report the following messages:
      
      
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            INFO  nmstate::nm::show] Got unsupported interface type generic: genev_sys_6081,
            ignoring\n[2024-08-19T12:18:04Z ERROR nmstate::query_apply::inter_ifaces] VerificationError:
            Absent/Down interface bridge-name/ovs-bridge still found as OvsBridge(OvsBridgeInterface
            { base: BaseInterface { name: \"bridge-name\", profile_name: None, description:
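
The message above contains literal `\n` sequences, which makes the actual error hard to spot among the warnings. As a minimal sketch, a hypothetical helper (`nmstate_errors`, not an nmstate tool) could turn those sequences into real line breaks and keep only the ERROR lines. It assumes the message text is supplied on stdin, however it was captured.

```shell
#!/bin/sh
# Hypothetical helper: convert literal "\n" sequences in captured nmstate
# log text into real line breaks, then keep only the ERROR lines.
# Reads the captured text on stdin.
nmstate_errors() {
    awk '{ gsub(/\\n/, "\n"); print }' | grep 'ERROR'
}
```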
       
      
      

      Version-Release number of selected component (if applicable):

          OCP 4.15 

      How reproducible:

          Easily

      Steps to Reproduce:

      1. Apply the following configuration to the platform: 
      
      
      # oc get nncp  ovs-localnet-rep -oyaml
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        annotations:
        name: ovs-localnet-rep
      spec:
        desiredState:
          interfaces:
          - bridge:
              allow-extra-patch-ports: true
              options:
                stp: true
              port:
              - name: bond1
            ipv4:
              enabled: false
            name: newbr
            state: up
            type: ovs-bridge
          ovn:
            bridge-mappings:
            - bridge: newbr
              localnet: nad-test
              state: present
        nodeSelector:
          nmstate: test
      
      # oc get nncp  bondreproducer -oyaml
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        annotations:
        name: bondreproducer
      spec:
        desiredState:
          interfaces:
          - link-aggregation:
              mode: active-backup
              port:
              - ensxxx
              - ensxxx
            name: bond1
            state: up
            type: bond
        nodeSelector:
          nmstate: test
      
      2. Check that the configuration is applied to the respective nodes:
      
      # oc get nnce
      NAME                                                          STATUS      STATUS AGE   REASON
      master1.bm-upi.<test>.bondreproducer   Available   31m          SuccessfullyConfigured
      master1.bm-upi.<test>.ovs-localnet-rep     Available   29m          SuccessfullyConfigured
      master2.bm-upi.<test>.bondreproducer   Available   87m          SuccessfullyConfigured
      master2.bm-upi.<test>.ovs-localnet-rep     Available   29m          SuccessfullyConfigured
      
      3. Change the NNCP so that the bridge and the localnet mapping are set to absent:
      
      
       # oc get nncp  ovs-localnet-rep -oyaml
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        annotations:
        name: ovs-localnet-rep
      spec:
        desiredState:
          interfaces:
          - bridge:
              allow-extra-patch-ports: true
              options:
                stp: true
              port:
              - name: bond1
            ipv4:
              enabled: false
            name: newbr
            state: absent  # <-- changed
            type: ovs-bridge
          ovn:
            bridge-mappings:
            - bridge: newbr
              localnet: nad-test
              state: absent  # <-- changed
        nodeSelector:
          nmstate: test
      
       
      4. Confirm that the NNCE status changed to FailedToConfigure:
      
      # oc get nnce master2.bm-upi.<test>.ovs-localnet-rep
      NAME                                                        STATUS    STATUS AGE   REASON
      master2.bm-upi.<test>.ovs-localnet-rep   Failing   33s          FailedToConfigure
      
      5. Check whether any pod references the OVN localnet interface through a net-attach-def resource. The issue seems to happen more often on nodes where such pods are running:
      
      $ oc get net-attach-def -oyaml 
      
          config: |-
            {
              "cniVersion": "0.3.1",
              "name": "nad-test",
              "type": "ovn-k8s-cni-overlay",
              "topology":"localnet",
              "subnets": "xx.xx.xx.0/24",
              "mtu": 1300,
              "netAttachDefName": "test-nad/nad-test"
            }
      
      $ oc get pod -oyaml 
      
      apiVersion: v1
      kind: Pod
      metadata:
        annotations:
          k8s.v1.cni.cncf.io/network-status: |-
            [{
                "name": "test-nad/nad-test",
                "interface": "net1",
                "ips": [
                    "xx.xx.xx.51"
                ],
                "mac": "xx:xx:xx:xx:xx:xx",
                "dns": {}
            }]
      
      6. If the bridge is removed from OVS and the configuration is reapplied, the issue is resolved: 
      
      $ oc debug node/<node-name>
      sh-5.1# chroot /host
      sh-5.1# ovs-vsctl list-br                 # find the bridge from the configuration
      sh-5.1# ovs-vsctl del-br <bridge-name>    # delete the bridge
      
      7. Force nmstate to reapply the configuration.
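
The manual cleanup in step 6 could be wrapped in a small helper like the sketch below. `remove_leftover_bridge` and the `OVS_VSCTL` override are hypothetical names introduced for illustration. The helper relies only on `ovs-vsctl br-exists` and `ovs-vsctl del-br`, and is meant to run on the affected node after `chroot /host`. It does not cover step 7; how the reapply is forced is not specified in this report.

```shell
#!/bin/sh
# Hypothetical helper: remove a leftover OVS bridge if it still exists.
# OVS_VSCTL can be overridden (for example in tests); it defaults to the
# real ovs-vsctl CLI.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

remove_leftover_bridge() {
    bridge="$1"
    if [ -z "$bridge" ]; then
        echo "usage: remove_leftover_bridge <bridge-name>" >&2
        return 2
    fi
    # br-exists exits 0 when the bridge exists and non-zero otherwise.
    if "$OVS_VSCTL" br-exists "$bridge"; then
        echo "removing leftover bridge: $bridge"
        "$OVS_VSCTL" del-br "$bridge"
    else
        echo "bridge not present: $bridge"
    fi
}
```

For the reproducer in this report the call would be `remove_leftover_bridge newbr`. Checking for the bridge first keeps the helper safe to run on nodes where the bridge was already removed correctly.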
      
      

      Actual results:

          The NNCE resource reports the following error and the configuration is not applied:
      
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            WARN  nmstate::ovsdb::show] Unknown OVS interface type \n[2024-08-19T12:18:04Z
            INFO  nmstate::nm::show] Got unsupported interface type generic: genev_sys_6081,
            ignoring\n[2024-08-19T12:18:04Z ERROR nmstate::query_apply::inter_ifaces] VerificationError:
            Absent/Down interface bridge-name/ovs-bridge still found as OvsBridge(OvsBridgeInterface
      
      

       

      Expected results:

          The configuration is expected to finish successfully and the interface to be removed. When this issue is combined with bug RHEL-40683, the configuration is reapplied every time the node label changes, which often leads to FailedToConfigure on the platform.

      Additional info:

          

              Gris Ge (fge@redhat.com)
              Bruno Gomes (rhn-support-bgomes)
              Network Management Team
              Qiong Wang