OpenShift Bugs / OCPBUGS-42650

[OCP 4.15] Pod to Pod communication failing, failed (Invalid argument) on packet

    • 10/08 Single node failing pod to api; Sev 3 case

      Description of problem:

         Pod-to-pod communication times out, but only on one node of the cluster.

         The issue first appeared while setting up the nvidia-driver-daemonset.

         Not all pods are affected: the "openshift-network-diagnostics" pods running on that host seem to work, while others fail.

         All failing pods report the same error:
      dial tcp 172.30.0.1:443: i/o timeout
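
To compare affected and unaffected pods, the timeout can be checked with a small probe. The following is a minimal sketch, not part of the original report: the function name `probe_tcp` and the 5-second default timeout are assumptions chosen for illustration, and the only value taken from the report is the address 172.30.0.1:443.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Returns "ok" if the handshake completes, "timeout" if no answer
    arrives within `timeout` seconds, or "error: <reason>" otherwise.
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        # Same symptom as the "i/o timeout" reported above.
        return "timeout"
    except OSError as exc:
        # For example: connection refused or no route to host.
        return f"error: {exc}"
```

Calling `probe_tcp("172.30.0.1", 443)` from a pod on the affected node and from a pod on a healthy node would show whether the timeout follows the node or the pod. A "timeout" result matches the reported symptom, while an "error: ..." result would point to a different failure.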
      Version-Release number of selected component (if applicable):

      OpenShift 4.15.28

      How reproducible:

      Appears to be always reproducible on that specific node.

      Steps to Reproduce:

      1. Deploy nvidia-driver-daemonset

      Actual results:

        The only error observed on that node so far is:
        

      $ less openvswitch/journalctl_--no-pager_--unit_ovs-vswitchd
      ...
      Aug 08 04:11:43 node.cluster.example.com ovs-vswitchd[3116]: ovs|00002|dpif(handler416)|WARN|system@ovs-system: execute ct(commit,zone=111,mark=0/0x1,nat(src)),ct(zone=42,nat),recirc(0x11a957) failed (Invalid argument) on packet tcp,vlan_tci=0x0000,dl_src=0a:58:xx:yy:zz:e7,dl_dst=0a:58:xx:yy:zz:18,nw_src=10.xxx.17.231,nw_dst=10.xxx.16.24,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=8140,tp_dst=40832,tcp_flags=psh|ack tcp_csum:6a30 
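
For triage across many such warnings, the line above can be split into its datapath actions, failure reason, and packet fields. This is a rough sketch that assumes every warning follows the layout of the single sample in this report. The names `parse_execute_failure` and `WARN_RE` are hypothetical helpers, not part of Open vSwitch.

```python
import re

# Message portion of the warning quoted in this report (masked values kept as-is).
SAMPLE = (
    "ovs|00002|dpif(handler416)|WARN|system@ovs-system: execute "
    "ct(commit,zone=111,mark=0/0x1,nat(src)),ct(zone=42,nat),recirc(0x11a957) "
    "failed (Invalid argument) on packet tcp,vlan_tci=0x0000,"
    "dl_src=0a:58:xx:yy:zz:e7,dl_dst=0a:58:xx:yy:zz:18,"
    "nw_src=10.xxx.17.231,nw_dst=10.xxx.16.24,nw_tos=0,nw_ecn=0,nw_ttl=64,"
    "nw_frag=no,tp_src=8140,tp_dst=40832,tcp_flags=psh|ack tcp_csum:6a30 "
)

# Assumed layout: "execute <actions> failed (<reason>) on packet <flow>".
WARN_RE = re.compile(
    r"execute (?P<actions>.+?) failed \((?P<reason>[^)]*)\) on packet (?P<packet>.+)$"
)


def parse_execute_failure(line):
    """Return a dict describing the failed datapath execute, or None."""
    match = WARN_RE.search(line)
    if match is None:
        return None
    # The flow comes first; trailing tokens such as "tcp_csum:6a30" follow.
    flow, *extras = match.group("packet").split()
    protocol = None
    fields = {}
    for token in flow.split(","):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
        elif protocol is None:
            protocol = key  # bare leading token, e.g. "tcp"
    actions = match.group("actions")
    return {
        "actions": actions,
        "reason": match.group("reason"),
        # Conntrack zones referenced by the ct() actions, in order.
        "ct_zones": [int(z) for z in re.findall(r"zone=(\d+)", actions)],
        "protocol": protocol,
        "fields": fields,
        "extras": extras,
    }
```

Grouping parsed warnings by `ct_zones` or by `nw_src`/`nw_dst` may show whether the failures are confined to particular pods on the node. That grouping is only a suggested way to narrow the search, not a diagnosis.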

      Expected results:

      No error

      Additional info:

      This is a baremetal node with a GPU, but it is not the only one: two other such nodes are part of a different machine-config-pool and have no reported issues.

      Affected Platforms:
      Agnostic cluster with virtualized and baremetal nodes

              sdn-team-bot sdn-team bot
              rhn-support-mabajodu Mario Abajo Duran
              Anurag Saxena Anurag Saxena