OpenShift Bugs / OCPBUGS-12001

OpenshiftSDN keeps accepting connections in service ip without available endpoints

    • Moderate
    • No
    • SDN Sprint 235
    • 1
    • Rejected
    • False

      Description of problem:

      The OpenshiftSDN network plugin does not reset the TCP session when it receives a connection to a service IP that has no available endpoints. The same test with OVN-Kubernetes gives a different result: OVN-K resets the connection.

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      Deploy a test application in the cluster and scale it down to zero replicas while curl requests are running against its service.

      Steps to Reproduce:

      1. Run the script below:
      #!/bin/bash
      
      oc new-project test-ep
      oc project test-ep
      oc new-app --name client-ep --image registry.redhat.io/rhel8/httpd-24:1-248
      oc new-app --name server-ep --image registry.redhat.io/rhel8/httpd-24:1-248
      sleep 15
      # Run the curl loop inside the client pod; a bare "oc rsh" would open an
      # interactive shell and the loop would only start after that shell exits.
      oc rsh "$(oc get pod | grep -i client-ep | awk '{print $1}')" \
        bash -c 'while true; do time curl -vI server-ep:8080; sleep 1; done'
      
      2. Open a second terminal 
      3. Scale the server-ep deployment down to zero replicas (oc scale deployment server-ep --replicas 0)
      4. Check the results in the first terminal
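
      To make the difference visible without waiting two minutes per attempt, the curl loop from step 1 can be bounded with a connect timeout. The sketch below is illustrative only: the helper names classify_curl_exit and probe_once and the 5 and 10 second limits are assumptions, not part of the original test. Exit codes 7 and 56 are the ones shown in this report; 28 is the code curl returns when one of its own timeouts expires.

```shell
#!/bin/bash
# Map a curl exit code to what it suggests about the service IP.
classify_curl_exit() {
  case "$1" in
    0)  echo "connected" ;;
    7)  echo "connect-failed" ;;        # refused/reset during connect, or kernel connect timeout
    28) echo "timed-out" ;;             # no answer within curl's own time limits
    56) echo "reset-after-connect" ;;   # "Recv failure: Connection reset by peer"
    *)  echo "other($1)" ;;
  esac
}

# Probe a URL once with bounded timeouts (5s connect, 10s total: arbitrary values).
probe_once() {
  local url=$1 rc=0
  curl -s -o /dev/null -I --connect-timeout 5 --max-time 10 "$url" || rc=$?
  classify_curl_exit "$rc"
}
```

      Inside the client pod, a loop such as `while true; do probe_once server-ep:8080; sleep 1; done` would then print one label per attempt. A plugin that silently drops the SYNs would be expected to show timed-out, while one that resets would show connect-failed or reset-after-connect.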

      Actual results:

      With OpenshiftSDN, each connection attempt hangs for about 2 minutes before curl reports a connection timeout. With OVN-Kubernetes the behavior is different: the network plugin resets the client's connection immediately.
      
      * Rebuilt URL to: server-ep:8080/
      *   Trying 172.30.56.50...
      * TCP_NODELAY set
      * connect to 172.30.56.50 port 8080 failed: Connection timed out
      * Failed to connect to server-ep port 8080: Connection timed out
      * Closing connection 0
      curl: (7) Failed to connect to server-ep port 8080: Connection timed out
      
      real	2m9.071s <--- Waited 2 min until timing out due to curl timeout
      user	0m0.007s
      sys	0m0.004s

      Expected results:

      The network plugin should send a TCP reset (RST) so that clients fail immediately instead of retrying the connection until it times out.
      
      Same test with OVN-Kubernetes:
      * Rebuilt URL to: server-ep:8080/
      *   Trying 172.30.43.148...
      * TCP_NODELAY set
      * Connected to server-ep (172.30.43.148) port 8080 (#0)
      > HEAD / HTTP/1.1
      > Host: server-ep:8080
      > User-Agent: curl/7.61.1
      > Accept: */*
      > 
      * Recv failure: Connection reset by peer
      * Closing connection 0
      curl: (56) Recv failure: Connection reset by peer <<<<---- Connection reset
      
      real	0m0.050s
      user	0m0.005s
      sys	0m0.002s

      Additional info:

      Comparison of the packets captured with OpenshiftSDN and with OVN-Kubernetes:
      
      OpenshiftSDN
      Comments: No reset (RST) packet is seen after the deployment is scaled down to 0; the client only retransmits its SYN
      
      09:02:12.510203 0a:58:0a:80:02:53 > 7e:d8:f2:c4:39:db, ethertype IPv4 (0x0800), length 74: 10.128.2.83.37900 > 172.30.56.50.webcache: Flags [S], seq 449247158, win 28200, options [mss 1410,sackOK,TS val 2228507017 ecr 0,nop,wscale 7], length 0
      09:02:13.539377 0a:58:0a:80:02:53 > 7e:d8:f2:c4:39:db, ethertype IPv4 (0x0800), length 74: 10.128.2.83.37900 > 172.30.56.50.webcache: Flags [S], seq 449247158, win 28200, options [mss 1410,sackOK,TS val 2228508047 ecr 0,nop,wscale 7], length 0
      09:02:15.588385 0a:58:0a:80:02:53 > 7e:d8:f2:c4:39:db, ethertype IPv4 (0x0800), length 74: 10.128.2.83.37900 > 172.30.56.50.webcache: Flags [S], seq 449247158, win 28200, options [mss 1410,sackOK,TS val 2228510096 ecr 0,nop,wscale 7], length 0
      09:02:19.619569 0a:58:0a:80:02:53 > 7e:d8:f2:c4:39:db, ethertype IPv4 (0x0800), length 74: 10.128.2.83.37900 > 172.30.56.50.webcache: Flags [S], seq 449247158, win 28200, options [mss 1410,sackOK,TS val 2228514127 ecr 0,nop,wscale 7], length 0
      09:02:28.004368 0a:58:0a:80:02:53 > 7e:d8:f2:c4:39:db, ethertype IPv4 (0x0800), length 74: 10.128.2.83.37900 > 172.30.56.50.webcache: Flags [S], seq 449247158, win 28200, options [mss 1410,sackOK,TS val 2228522512 ecr 0,nop,wscale 7], length 0
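
      The gaps between the retransmitted SYNs above are roughly 1, 2, 4 and 8 seconds, which is the usual exponential backoff. As a rough check on the 2m9s reported by `time`, the sketch below sums that backoff. It assumes an initial retransmission timeout of 1 second and 6 SYN retries (the usual Linux default for net.ipv4.tcp_syn_retries; the report does not state the client's setting). The function name is hypothetical.

```shell
# Total time before connect() gives up when every SYN is dropped:
# the initial SYN plus N retransmissions, with the wait doubling each time.
syn_connect_timeout() {
  local retries=$1 rto=${2:-1} total=0 i=0
  while [ "$i" -le "$retries" ]; do
    total=$((total + rto))   # wait one RTO before the next retransmit (or final give-up)
    rto=$((rto * 2))         # exponential backoff
    i=$((i + 1))
  done
  echo "$total"
}

syn_connect_timeout 6   # seconds, assuming 6 retries and a 1s initial RTO
```

      Under these assumptions the result is 127 seconds, which is close to the 2m9s measured above.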
      
      
      OVN-Kubernetes (Pod 10.131.0.6 / svc ip 172.30.8.188)
      Comments: Note the reset (RST) packets sent from source 172.30.8.188 after the deployment is scaled down to 0
      
      15:21:19.929812 IP 172.30.0.10.domain > 10.131.0.6.52913: 12070*- 0/1/0 (250)
      15:21:19.930091 IP 10.131.0.6.37224 > 172.30.8.188.webcache: Flags [S], seq 2722339236, win 27200, options [mss 1360,sackOK,TS val 3083358821 ecr 0,nop,wscale 7], length 0
      15:21:20.961668 IP 10.131.0.6.37224 > 172.30.8.188.webcache: Flags [S], seq 2722339236, win 27200, options [mss 1360,sackOK,TS val 3083359853 ecr 0,nop,wscale 7], length 0
      15:21:20.962354 IP 172.30.8.188.webcache > 10.131.0.6.37224: Flags [R.], seq 0, ack 2722339237, win 0, length 0
      15:21:21.972754 IP 10.131.0.6.35870 > 172.30.0.10.domain: 63652+ A? server-ep.test-ep.svc.cluster.local. (106)
      15:21:21.972808 IP 10.131.0.6.35870 > 10.131.0.3.mdns: 63652+ A (QM)? server-ep.test-ep.svc.cluster.local. (106)
      15:21:21.973083 IP 10.131.0.3.mdns > 10.131.0.6.35870: 32679*- 0/1/0 (250)
      15:21:21.973133 IP 172.30.0.10.domain > 10.131.0.6.35870: 32679*- 0/1/0 (250)
      15:21:21.973335 IP 10.131.0.6.37232 > 172.30.8.188.webcache: Flags [S], seq 2554820214, win 27200, options [mss 1360,sackOK,TS val 3083360864 ecr 0,nop,wscale 7], length 0
      15:21:21.973957 IP 172.30.8.188.webcache > 10.131.0.6.37232: Flags [R.], seq 0, ack 2554820215, win 0, length 0
      15:21:22.981968 IP 10.131.0.6.40616 > 172.30.0.10.domain: 14449+ A? server-ep.test-ep.svc.cluster.local. (106)
      15:21:22.984636 IP 10.131.0.6.40616 > 10.131.0.3.mdns: 14449+ A (QM)? server-ep.test-ep.svc.cluster.local. (53)
      15:21:22.984679 IP 10.131.0.6.40616 > 10.131.0.3.mdns: 34418+ AAAA (QM)? server-ep.test-ep.svc.cluster.local. (53)
      15:21:22.984874 IP 10.131.0.3.mdns > 10.131.0.6.40616: 34418*- 0/1/0 (250)
      15:21:22.984902 IP 172.30.0.10.domain > 10.131.0.6.40616: 34418*- 0/1/0 (250)
      15:21:22.985569 IP 10.131.0.6.37242 > 172.30.8.188.webcache: Flags [S], seq 3621835632, win 27200, options [mss 1360,sackOK,TS val 3083361877 ecr 0,nop,wscale 7], length 0
      15:21:22.986048 IP 172.30.8.188.webcache > 10.131.0.6.37242: Flags [R.], seq 0, ack 3621835633, win 0, length 0
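
      A quick way to compare the two captures is to count pure SYNs and RSTs in the tcpdump text output. This is a minimal sketch; the function name count_syn_rst is hypothetical, and it only matches the `Flags [...]` format shown above.

```shell
# Read tcpdump text output on stdin and count pure SYN and RST segments.
count_syn_rst() {
  awk '
    /Flags \[S\]/ { syn++ }   # SYN only; a SYN-ACK is printed as [S.] and is not counted
    /Flags \[R/   { rst++ }   # matches both [R] and [R.]
    END { printf "syn=%d rst=%d\n", syn, rst }
  '
}
```

      Fed the OpenshiftSDN lines above, it reports syn=5 rst=0; fed the OVN-Kubernetes lines, it reports syn=4 rst=3.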
      

       

       

            mkennell@redhat.com Martin Kennelly
            rhn-support-bgomes Bruno Gomes
            Zhanqi Zhao Zhanqi Zhao