  2. OCPBUGS-27852

ovnkube-controller bug: ovn service lb still has the endpoint when pod is in terminating state


    • Important
    • No
    • 5
    • SDN Sprint 248, SDN Sprint 249, SDN Sprint 250, SDN Sprint 251, SDN Sprint 252, SDN Sprint 253
    • 6
    • False
      * Previously, the endpoint selection logic for services only partially implemented KEP-1669 `ProxyTerminatingEndpoints` for traffic to services inside the {product-title} cluster. As a result, this traffic was forwarded to all endpoints that were either ready (`ready=true`, `serving=true`, `terminating=false`) or terminating and serving (`ready=false`, `serving=true`, `terminating=true`). This caused communication problems when traffic was forwarded to terminating endpoints that were no longer functional, unless the readiness probes on these endpoints were configured to quickly flag them as not serving (`serving=false`). With this release, the endpoint selection logic fully implements KEP-1669 `ProxyTerminatingEndpoints`: for any given service, all ready endpoints are selected, and functional terminating and serving endpoints are used only if no ready endpoints are found. (link:https://issues.redhat.com/browse/OCPBUGS-27852[*OCPBUGS-27852*])
    • Bug Fix
    • Done
    • Customer Escalated
    • OVNK sending traffic to terminating pods

      Description of problem:

      Users are experiencing an issue with NodePort traffic forwarding: TCP traffic continues to be directed to pods that are in the Terminating state, and connections to those pods cannot be established successfully. According to the customer, this issue is causing connection disruptions in business transactions.

       

      Version-Release number of selected component (if applicable):

      OpenShift 4.12.13 with RHEL 8.6 workers in an OVN environment.

       

      How reproducible:

      Here is the relevant code:
      https://github.com/openshift/ovn-kubernetes/blob/dd3c7ed8c1f41873168d3df26084ecbfd3d9a36b/go-controller/pkg/util/kube.go#L360

      func IsEndpointServing(endpoint discovery.Endpoint) bool {
              if endpoint.Conditions.Serving != nil {
                      return *endpoint.Conditions.Serving
              } else {
                      return IsEndpointReady(endpoint)
              }
      }

      // IsEndpointValid takes as input an endpoint from an endpoint slice and a boolean that indicates whether to include
      // all terminating endpoints, as per the PublishNotReadyAddresses feature in kubernetes service spec. It always returns true
      // if includeTerminating is true and falls back to IsEndpointServing otherwise.
      func IsEndpointValid(endpoint discovery.Endpoint, includeTerminating bool) bool {
              return includeTerminating || IsEndpointServing(endpoint)
      }

      It looks like the 'IsEndpointValid' function accepts any endpoint with serving=true; it does not check whether the endpoint is ready=true.
      I see that the code in this section was changed recently (the check for Ready=true was changed to a check for Serving=true)?
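
      To make the observation above concrete, here is a minimal, self-contained sketch of the quoted check. The `endpoint` and `conditions` types are simplified stand-ins for `discovery.Endpoint` (hypothetical, for illustration only, not the real API types), and `isEndpointReady` is assumed to treat an unset Ready condition as ready. Under those assumptions, an endpoint with ready=false, serving=true, terminating=true still passes the check:

```go
package main

import "fmt"

// conditions is a simplified stand-in for the endpoint conditions
// found in an EndpointSlice (hypothetical type, for illustration).
type conditions struct {
	Ready, Serving, Terminating *bool
}

// endpoint is a simplified stand-in for discovery.Endpoint.
type endpoint struct {
	Address    string
	Conditions conditions
}

func boolPtr(b bool) *bool { return &b }

// isEndpointReady is ASSUMED to treat an unset Ready condition as ready.
func isEndpointReady(ep endpoint) bool {
	return ep.Conditions.Ready == nil || *ep.Conditions.Ready
}

// isEndpointServing mirrors the quoted IsEndpointServing: it uses the
// Serving condition when set and falls back to readiness otherwise.
func isEndpointServing(ep endpoint) bool {
	if ep.Conditions.Serving != nil {
		return *ep.Conditions.Serving
	}
	return isEndpointReady(ep)
}

// isEndpointValid mirrors the quoted IsEndpointValid. Note that the
// Terminating and Ready conditions are never consulted when Serving is set.
func isEndpointValid(ep endpoint, includeTerminating bool) bool {
	return includeTerminating || isEndpointServing(ep)
}

func main() {
	terminating := endpoint{
		Address: "172.30.195.124",
		Conditions: conditions{
			Ready:       boolPtr(false),
			Serving:     boolPtr(true),
			Terminating: boolPtr(true),
		},
	}
	// The terminating endpoint is still accepted as a backend.
	fmt.Println(isEndpointValid(terminating, false)) // prints "true"
}
```

      The only condition that removes the endpoint under this check is serving=false, which depends on the readiness probe failing during termination.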

      [Check the "Serving" field for endpoints]
      https://github.com/openshift/ovn-kubernetes/commit/aceef010daf0697fe81dba91a39ed0fdb6563dea#diff-daf9de695e0ff81f9173caf83cb88efa138e92a9b35439bd7044aa012ff931c0

      https://github.com/openshift/ovn-kubernetes/blob/release-4.12/go-controller/pkg/util/kube.go#L326-L386

                  out.Port = *port.Port
                  for _, endpoint := range slice.Endpoints {
                      // Skip endpoint if it's not valid
                      if !IsEndpointValid(endpoint, includeTerminating) {
                          klog.V(4).Infof("Slice endpoint not valid")
                          continue
                      }
                      for _, ip := range endpoint.Addresses {
                          klog.V(4).Infof("Adding slice %s endpoint: %v, port: %d", slice.Name, endpoint.Addresses, *port.Port)
                          ipStr := utilnet.ParseIPSloppy(ip).String()
                          switch slice.AddressType {
                          case discovery.AddressTypeIPv4:
                              v4ips.Insert(ipStr)
                          case discovery.AddressTypeIPv6:
                              v6ips.Insert(ipStr)
                          default:
                              klog.V(5).Infof("Skipping FQDN slice %s/%s", slice.Namespace, slice.Name)
                          }
                      }
                  }

      Steps to Reproduce:

      Here are the customer's sample pods for reference:
      mbgateway-st-8576f6f6f8-5jc75   1/1     Running   0          104m    172.30.195.124   appn01-100.app.paas.example.com   <none>           <none>
      mbgateway-st-8576f6f6f8-q8j6k   1/1     Running   0          5m51s   172.31.2.97      appn01-202.app.paas.example.com   <none>           <none>

      pod yaml:
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 40
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 9190
            timeoutSeconds: 5
          name: mbgateway-st
          ports:
          - containerPort: 9190
            protocol: TCP
          readinessProbe:
            failureThreshold: 3
            initialDelaySeconds: 40
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 9190
            timeoutSeconds: 5
          resources:
            limits:
              cpu: "2"
              ephemeral-storage: 10Gi
              memory: 2G
            requests:
              cpu: 50m
              ephemeral-storage: 100Mi
              memory: 1111M

      After deleting the pod mbgateway-st-8576f6f6f8-5jc75, check the EndpointSlice status:
      addressType: IPv4
      apiVersion: discovery.k8s.io/v1
      endpoints:
      - addresses:
        - 172.30.195.124
        conditions:
          ready: false
          serving: true
          terminating: true
        nodeName: appn01-100.app.paas.example.com
        targetRef:
          kind: Pod
          name: mbgateway-st-8576f6f6f8-5jc75
          namespace: lb59-10-st-unigateway
          uid: 5e8a375d-ba56-4894-8034-0009d0ab8ebe
        zone: AZ61QEBIZ_AZ61QEM02_FD3
      - addresses:
        - 172.31.2.97
        conditions:
          ready: true
          serving: true
          terminating: false
        nodeName: appn01-202.app.paas.example.com
        targetRef:
          kind: Pod
          name: mbgateway-st-8576f6f6f8-q8j6k
          namespace: lb59-10-st-unigateway
          uid: 5bd195b7-e342-4b34-b165-12988a48e445
        zone: AZ61QEBIZ_AZ61QEM02_FD1

      After waiting a short while, check the OVN service load balancers. The endpoint information has not been updated: the terminating pod's IP (172.30.195.124) is still listed as a backend.
      9349d703-1f28-41fe-b505-282e8abf4c40    Service_lb59-10-    tcp        172.35.0.185:31693      172.30.195.124:9190,172.31.2.97:9190
      dca65745-fac4-4e73-b412-2c7530cf4a91    Service_lb59-10-    tcp        172.35.0.170:31693      172.30.195.124:9190,172.31.2.97:9190
      a5a65766-b0f2-4ac6-8f7c-cdebeea303e3    Service_lb59-10-    tcp        172.35.0.89:31693       172.30.195.124:9190,172.31.2.97:9190
      a36517c5-ecaa-4a41-b686-37c202478b98    Service_lb59-10-    tcp        172.35.0.213:31693      172.30.195.124:9190,172.31.2.97:9190
      16d997d1-27f0-41a3-8a9f-c63c8872d7b8    Service_lb59-10-    tcp        172.35.0.92:31693       172.30.195.124:9190,172.31.2.97:9190

      After waiting a little longer, check the EndpointSlice status again:
      addressType: IPv4
      apiVersion: discovery.k8s.io/v1
      endpoints:
      - addresses:
        - 172.30.195.124
        conditions:
          ready: false
          serving: true
          terminating: true
        nodeName: appn01-100.app.paas.example.com
        targetRef:
          kind: Pod
          name: mbgateway-st-8576f6f6f8-5jc75
          namespace: lb59-10-st-unigateway
          uid: 5e8a375d-ba56-4894-8034-0009d0ab8ebe
        zone: AZ61QEBIZ_AZ61QEM02_FD3
      - addresses:
        - 172.31.2.97
        conditions:
          ready: true
          serving: true
          terminating: false
        nodeName: appn01-202.app.paas.example.com
        targetRef:
          kind: Pod
          name: mbgateway-st-8576f6f6f8-q8j6k
          namespace: lb59-10-st-unigateway
          uid: 5bd195b7-e342-4b34-b165-12988a48e445
        zone: AZ61QEBIZ_AZ61QEM02_FD1
      - addresses:
        - 172.30.132.78
        conditions:
          ready: false
          serving: false
          terminating: false
        nodeName: appn01-089.app.paas.example.com
        targetRef:
          kind: Pod
          name: mbgateway-st-8576f6f6f8-8lp4s
          namespace: lb59-10-st-unigateway
          uid: 755cbd49-792b-4527-b96a-087be2178e9d
        zone: AZ61QEBIZ_AZ61QEM02_FD3

      Check the OVN service load balancers again. The terminating pod's endpoint (172.30.195.124) is still present:
      fceeaf8f-e747-4290-864c-ba93fb565a8a    Service_lb59-10-    tcp        172.35.0.56:31693       172.30.132.78:9190,172.30.195.124:9190,172.31.2.97:9190
      bef42efd-26db-4df3-b99d-370791988053    Service_lb59-10-    tcp        172.35.1.26:31693       172.30.132.78:9190,172.30.195.124:9190,172.31.2.97:9190
      84172e2c-081c-496a-afec-25ebcb83cc60    Service_lb59-10-    tcp        172.35.0.118:31693      172.30.132.78:9190,172.30.195.124:9190,172.31.2.97:9190
      34412ddd-ab5c-4b6b-95a3-6e718dd20a4f    Service_lb59-10-    tcp        172.35.1.14:31693       172.30.132.78:9190,172.30.195.124:9190,172.31.2.97:9190

       

      Actual results:

      Service LB backends are determined by the endpoint's serving condition (conditions.serving in the EndpointSlice).

      Expected results:

      Service LB backends should be determined by the Ready status (POD.status.condition[type=Ready]).
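
      The release note for this issue describes the behavior that was eventually adopted, which is slightly broader than a pure Ready check: all ready endpoints are selected, and terminating-and-serving endpoints are used only when no ready endpoint exists. A minimal sketch of that selection rule follows. The `endpointState` type and `selectBackends` function are hypothetical names used for illustration; this is not the actual ovn-kubernetes implementation.

```go
package main

import "fmt"

// endpointState is a hypothetical, flattened view of one EndpointSlice
// endpoint and its three conditions.
type endpointState struct {
	IP                          string
	Ready, Serving, Terminating bool
}

// selectBackends returns the IPs of all ready endpoints. Only when there
// are none does it fall back to endpoints that are terminating but still
// serving, as described in the release note for KEP-1669.
func selectBackends(eps []endpointState) []string {
	var ready, fallback []string
	for _, ep := range eps {
		switch {
		case ep.Ready:
			ready = append(ready, ep.IP)
		case ep.Serving && ep.Terminating:
			fallback = append(fallback, ep.IP)
		}
	}
	if len(ready) > 0 {
		return ready
	}
	return fallback
}

func main() {
	// Conditions taken from the second EndpointSlice dump in this report.
	eps := []endpointState{
		{IP: "172.30.195.124", Ready: false, Serving: true, Terminating: true},
		{IP: "172.31.2.97", Ready: true, Serving: true, Terminating: false},
		{IP: "172.30.132.78", Ready: false, Serving: false, Terminating: false},
	}
	fmt.Println(selectBackends(eps)) // prints "[172.31.2.97]"
}
```

      With this rule, the terminating pod 172.30.195.124 would be used as a backend only if no ready endpoint remained, for example during a rollout in which every old pod is terminating before any new pod is ready.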

       

      Additional info:

      The ovn-controller decides whether an endpoint should be added to the service load balancer (service LB) based on condition.serving. The current issue is that when a pod is in the terminating state, condition.serving remains true, because it follows the pod's Ready condition (POD.status.condition[type=Ready]), which is still true.

      However, when a pod is deleted, the EndpointSlice condition.serving state remains unchanged, and the backend pool of the service LB still includes the IP of the deleted pod. Why doesn't ovn-controller use the condition.ready status to decide whether the pod's IP should be added to the service LB backend pool?

      Could the OpenShift networking experts confirm whether this is a bug in the OpenShift OVN service LB?

            rravaiol@redhat.com Riccardo Ravaioli
            rhn-support-jiewu Jie Wu
            Jean Chen Jean Chen