OpenShift Bugs / OCPBUGS-61034

OCP 4.16.46 Workers/Masters cannot cross-talk via pod network after upgrade


    • Type: Bug
    • Resolution: Not a Bug
    • Affects Version: 4.16.z
    • Quality / Stability / Reliability
    • Severity: Critical

      Description of problem:

      Primary problem: OpenShift router-default pods (hostNetwork, running on worker nodes) cannot reach openshift-authentication (oauth) pods running on master nodes (connection timeout).

      ovnkube-trace indicates successful flows in both directions.

      The traffic-blocking issue is reproducible between all workers and all masters (vSphere infrastructure), even on the same vSphere host.

      No NetworkPolicies are present, and an ovnkube DB rebuild on all hosts makes no difference.

      Worker-to-worker and master-to-master traffic is unobstructed; worker-to-master (and vice versa) is blocked. See below for the confirmed flow matrix and a sketch of representative test commands:

      # cross-pool flows (NOT OK):
      hostNetwork (worker) --> endpointIP (master) (fails)
      podNetwork (worker) --> endpointIP (master) (fails)
      hostNetwork (worker) --> serviceIP --> endpoint (master) (succeeds) !!INTERESTING!!
      podNetwork (worker) --> serviceIP --> endpoint (master) (fails)
      ---
      
      # same-pool flows (OK):
      hostNetwork (worker) --> endpointIP (worker) (succeeds)
      podNetwork (worker) --> endpointIP (worker) (succeeds)
      hostNetwork (master) --> endpointIP (master) (succeeds)
      podNetwork (master) --> endpointIP (master) (succeeds)
      
      # host network to host network (OK):
      worker PING or CURL to masterIP:6443 (succeeds)
      master PING or CURL to workerIP:443 (succeeds)
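
      For reference, a minimal sketch of the kind of commands used to exercise these paths (pod names, node names, IPs, ports, and URL paths below are placeholders, and curl is assumed to be available in the client images):

      # hostNetwork client on a worker --> oauth endpoint IP on a master (fails)
      oc -n openshift-ingress exec <router-default-pod> -- curl -kv --connect-timeout 5 https://<oauth-endpoint-ip>:<endpoint-port>/healthz
      
      # same client --> the oauth-openshift service IP (returns 200)
      oc -n openshift-ingress exec <router-default-pod> -- curl -kv --connect-timeout 5 https://<oauth-service-ip>:<service-port>/healthz
      
      # podNetwork client on a worker --> oauth endpoint IP on a master (fails)
      oc -n <test-namespace> exec <pod-network-test-pod> -- curl -kv --connect-timeout 5 https://<oauth-endpoint-ip>:<endpoint-port>/healthz
      
      # host network to host network across the pools (succeeds)
      oc debug node/<worker-node> -- curl -kv --connect-timeout 5 https://<master-node-ip>:6443/readyz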

       

      Version-Release number of selected component (if applicable):

      4.16.46

      How reproducible:

      • Continuously - ongoing in a single customer environment; we have been unable to replicate it internally.

      Steps to Reproduce:

      //Timeline of events:
      
      4.12 cluster created (SDN)
      
      4.14 upgraded successfully
      
      SDN --> OVN (completed with some issues)
      
      All worker nodes were replaced with freshly reinstalled nodes using expanded subnets, because the original CIDR was too small. Rather than reinstalling the cluster, each worker was removed and a fresh replacement host was installed and added to the cluster one at a time with the expanded subnet. (Masters were NOT replaced.)
      
      (20d uptime healthy)
      
      5 days ago:
      4.14 --> 4.15 (masters only)
      
      4.15 --> 4.16 (masters and workers) [problem state started]
      
      
      

      Actual results:

      • Customer platform is degraded

      Expected results:

      Additional info:

      • Cloudpaks cluster, IBM support engaged
      • TCPDUMP has been pulled, and we observe that in the failure state the SYN packet from the router pod (or a client on the worker) is received by the target oauth container and a SYN/ACK is generated. However, the SYN/ACK is NOT seen by the client side on any interface (a capture sketch follows this list).
      • No NetworkPolicies in place
      • No nftable or IPtable rule modifications on any host
      • No Firewall between the nodes or traffic shaping has been observed
      • 6081/UDP (Geneve) traffic is unobstructed as far as we can tell.
      • ovnkube-trace output is available (will upload separately) and indicates successful flows.
      • ovnkube DB rebuilds and node reboots do not make a difference.
      • NTP time is synced on the hosts.
      • All VMs have been moved to the same ESXi host to confirm local routing is okay/unblocked (no change).
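
      A minimal capture sketch for confirming the one-way SYN/ACK observation from both ends (node names, interface names, ports, and addresses are placeholders; tcpdump is run on the host via oc debug):

      # on the master hosting the oauth pod: the SYN arrives and a SYN/ACK is generated
      oc debug node/<master-node> -- chroot /host tcpdump -nn -i any "tcp port <endpoint-port> and host <client-ip>"
      
      # on the client worker: the SYN/ACK never shows up on any interface
      oc debug node/<worker-node> -- chroot /host tcpdump -nn -i any "tcp port <endpoint-port> and host <oauth-endpoint-ip>"
      
      # Geneve encapsulation (6081/UDP) between the two nodes, captured from both ends
      oc debug node/<worker-node> -- chroot /host tcpdump -nn -i <uplink-interface> "udp port 6081 and host <master-node-ip>"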

       

      The most notable behavior is that when we curl from the router pod (hostNetwork) to the SERVICE IP (internal) of the oauth pods in openshift-authentication, we get a 200. If we call the endpoint directly at the pod's exposed port, the call fails.

      • Question for engineering: what is different about this flow? We should be NATing the request through the 100.88.0.0/16 or 100.64.0.0/16 subnet in either case; perhaps source NAT is the difference - via the service IP the pod would see a NATed address as the client for the return path, while a direct call from hostNetwork would present the node IP as the source? (See the OVN NAT/load-balancer checks sketched below.)
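
      One way to compare the two paths from the OVN side is to dump the NAT and load-balancer programming for the client node's gateway router. A minimal sketch, assuming 4.16's interconnect layout where the NB database runs in the ovnkube-node pod on each node (pod, node, and service values are placeholders):

      # ovnkube-node pod running on the client worker
      oc -n openshift-ovn-kubernetes get pods -o wide | grep ovnkube-node
      
      # SNAT rules on that node's gateway router (named GR_<node-name>)
      oc -n openshift-ovn-kubernetes exec <ovnkube-node-pod> -c nbdb -- ovn-nbctl lr-nat-list GR_<worker-node>
      
      # load-balancer entries carrying the oauth service VIP
      oc -n openshift-ovn-kubernetes exec <ovnkube-node-pod> -c nbdb -- ovn-nbctl lb-list | grep <oauth-service-ip>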

      //analysis + suspicion:

      • I expect there is a return-path problem here, where the reply packet is sent out into the network because of a configuration issue at the gateway/switch, but I can't rule out a routing/handling issue in the ovnkube routing tables, which is why this needs a second opinion (see the return-path checks sketched below).
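
      A minimal sketch of return-path checks on the client worker, assuming conntrack is available on the host (node names, ports, and addresses are placeholders):

      # connection tracking: a flow whose SYN/ACK never returns should sit in SYN_SENT
      oc debug node/<worker-node> -- chroot /host conntrack -L -p tcp --dport <endpoint-port>
      
      # host routes for the join/transit subnets and OVN interfaces mentioned above
      oc debug node/<worker-node> -- chroot /host ip route show | grep -E "100.64|100.88|br-ex|ovn-k8s-mp0"
      
      # reverse-path filtering: a strict setting can silently drop an asymmetric return
      oc debug node/<worker-node> -- chroot /host sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.br-ex.rp_filter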

      //Data gathered + compiled and available in first private comment in the bug below. 

              Ben Bennett (bbennett@redhat.com)
              Will Russell (rhn-support-wrussell)
              Anurag Saxena