Network Hardware Enablement / NHE-561

Fix broken flows for 2 Cluster Design DPU with BlueField-2


Details

    • Bug
    • Resolution: Done
    • Critical
    • None
    • openshift-4.14
    • DPU
    • None
    • NHE Sprint 235, NHE Sprint 236

    Description

      The flows marked Fail below are broken: they never establish a connection.

      The client's TCP SYN packets are never acknowledged by the server side.
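An unanswered SYN typically shows up on the client as a connect timeout, whereas a reachable host with no listener answers with a reset. The following is a minimal sketch for telling the two apart from a client pod, assuming Python is available there. The function name `probe_tcp` and the address in the comment are hypothetical and are not part of the test tooling that produced the results in this issue.

```python
import socket


def probe_tcp(host, port, timeout=2.0):
    """Classify one TCP connect attempt as 'connected', 'refused', or 'timeout'."""
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # The peer answered the SYN with a RST: reachable, but nothing listening.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer arrived within the timeout, consistent with an unacked SYN.
        return "timeout"


# Hypothetical usage against a NodePort (address and port are placeholders):
# print(probe_tcp("192.0.2.10", 30000))
```

A result of `timeout` would be consistent with the symptom described above; `refused` would instead point at the backend not listening. Other errors, such as an unreachable network, are deliberately left to propagate.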

      1-a: Pod to Pod (Same Node): 20.5Gbits/sec Pass 20.3Gbits/sec Pass
      1-b: Pod to Pod (Different Node): 20.3Gbits/sec Pass 22.3Gbits/sec Pass
      5-a: Pod -> NodePort Service traffic (Pod Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
      5-b: Pod -> NodePort Service traffic (Pod Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
      6-a: Pod -> NodePort Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
      6-b: Pod -> NodePort Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
      3-a: Pod -> Cluster IP Service traffic (Pod Backend - Same Node): 20.3Gbits/sec Pass 21.2Gbits/sec Pass
      3-b: Pod -> Cluster IP Service traffic (Pod Backend - Different Node): 20.0Gbits/sec Pass 22.1Gbits/sec Pass
      4-a: Pod -> Cluster IP Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
      4-b: Pod -> Cluster IP Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
      9-a: Host Pod -> Cluster IP Service traffic (Pod Backend - Same Node): 20.8Gbits/sec Pass 20.3Gbits/sec Pass
      9-b: Host Pod -> Cluster IP Service traffic (Pod Backend - Different Node): 20.0Gbits/sec Pass 22.1Gbits/sec Pass
      10-a: Host Pod -> Cluster IP Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
      10-b: Host Pod -> Cluster IP Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
      11-a: Host Pod -> NodePort Service traffic (Pod Backend - Same Node): 20.4Gbits/sec Pass 20.3Gbits/sec Pass
      11-b: Host Pod -> NodePort Service traffic (Pod Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
      12-a: Host Pod -> NodePort Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
      12-b: Host Pod -> NodePort Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
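The pattern in the results above is easier to see when the failing flows are grouped. The following is a minimal sketch that parses lines in the format used above and lists the failing flow IDs. The helper names `parse_results` and `failed_ids` are illustrative, and since the two measurements per flow are not labeled in this issue, the code only calls them first and second.

```python
import re

RESULTS = """\
1-a: Pod to Pod (Same Node): 20.5Gbits/sec Pass 20.3Gbits/sec Pass
1-b: Pod to Pod (Different Node): 20.3Gbits/sec Pass 22.3Gbits/sec Pass
5-a: Pod -> NodePort Service traffic (Pod Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
5-b: Pod -> NodePort Service traffic (Pod Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
6-a: Pod -> NodePort Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
6-b: Pod -> NodePort Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
3-a: Pod -> Cluster IP Service traffic (Pod Backend - Same Node): 20.3Gbits/sec Pass 21.2Gbits/sec Pass
3-b: Pod -> Cluster IP Service traffic (Pod Backend - Different Node): 20.0Gbits/sec Pass 22.1Gbits/sec Pass
4-a: Pod -> Cluster IP Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
4-b: Pod -> Cluster IP Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
9-a: Host Pod -> Cluster IP Service traffic (Pod Backend - Same Node): 20.8Gbits/sec Pass 20.3Gbits/sec Pass
9-b: Host Pod -> Cluster IP Service traffic (Pod Backend - Different Node): 20.0Gbits/sec Pass 22.1Gbits/sec Pass
10-a: Host Pod -> Cluster IP Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
10-b: Host Pod -> Cluster IP Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
11-a: Host Pod -> NodePort Service traffic (Pod Backend - Same Node): 20.4Gbits/sec Pass 20.3Gbits/sec Pass
11-b: Host Pod -> NodePort Service traffic (Pod Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
12-a: Host Pod -> NodePort Service traffic (Host Backend - Same Node): 0 bits/sec Fail 0 bits/sec Fail
12-b: Host Pod -> NodePort Service traffic (Host Backend - Different Node): 0 bits/sec Fail 0 bits/sec Fail
"""

# One result line: "<id>: <flow description>: <rate> <Pass|Fail> <rate> <Pass|Fail>"
LINE = re.compile(
    r"^(?P<id>\d+-[ab]): (?P<flow>.+): "
    r"(?P<rate1>[\d.]+ ?G?bits/sec) (?P<status1>Pass|Fail) "
    r"(?P<rate2>[\d.]+ ?G?bits/sec) (?P<status2>Pass|Fail)$"
)


def parse_results(text):
    """Parse result lines into dicts; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            row = m.groupdict()
            # A flow counts as failed if either measurement failed.
            row["failed"] = "Fail" in (row["status1"], row["status2"])
            rows.append(row)
    return rows


def failed_ids(rows):
    """Return the IDs of failing flows in their original order."""
    return [r["id"] for r in rows if r["failed"]]


if __name__ == "__main__":
    rows = parse_results(RESULTS)
    print("failing flows:", ", ".join(failed_ids(rows)))
```

Grouped this way, the data shows that every flow with a Host Backend fails, as do both Pod -> NodePort flows with a Pod Backend and the cross-node Host Pod -> NodePort flow 11-b.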

      The setup uses shared gateway mode in a 2-cluster design and runs the latest downstream OVN-K from https://github.com/openshift/ovn-kubernetes.

      This problem can be seen in OCP 4.13.

          People

            wizhao@redhat.com William Zhao
            bnemeth@redhat.com Balazs Nemeth
            Salvatore Daniele