  1. OpenShift Service Mesh
  2. OSSM-3391

[OSSM] stickiness on remote endpoint with federated service without circuit breaking


    • Type: Ticket
    • Resolution: Obsolete
    • Priority: Major
    • OSSM 2.3.1
    • Customer Impact, Maistra

      What problem/issue/behavior are you having trouble with? What do you expect to see?
      We run the OSSM operator 2.3.1 on OpenShift 4.12.

      We created a basic federation hello-world scenario, very much like the one described in the documentation (https://docs.openshift.com/container-platform/4.12/service_mesh/v2x/ossm-federation.html): red-mesh exports a service to green-mesh via an ExportedServiceSet and an ImportedServiceSet.

      Then we tried the following experiment:
      1) we scale the red webserver deployment to 2 replicas
      2) we start sending traffic from a single green client pod to the imported red service
      3) we observe that the traffic is load-balanced 50/50 across the 2 red pods
      4) we change the HTTP response code from 200 to 503 on one red webserver pod only (pod #1)
      5) we observe that all the traffic now goes only to webserver pod #2
      6) we change the HTTP response code back from 503 to 200 on red webserver pod #1
      7) we observe that all the traffic still goes only to webserver pod #2
      8) we delete either the green client injector pod or the green egress
      9) we observe that the traffic is again load-balanced 50/50 across the 2 red pods
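The observations in steps 3, 5, 7 and 9 can be quantified with a small client-side tally. The following is a minimal sketch, not the tooling used in this ticket: it assumes the red webserver returns its own pod name in the response body, and the helper names (`fetch_backend_id`, `tally`, `sample`) and any URL passed to them are hypothetical.

```python
from collections import Counter
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_backend_id(url: str, timeout: float = 2.0) -> str:
    """Return the response body, ASSUMED to be the serving pod's name.

    An HTTP error status (for example 503) is reported as 'HTTP <code>'.
    Connection-level failures are not caught and will raise.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace").strip()
    except HTTPError as err:
        return f"HTTP {err.code}"


def tally(backend_ids):
    """Map each backend identifier to its share of the observed responses."""
    counts = Counter(backend_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


def sample(url: str, requests: int = 100):
    """Send `requests` sequential GETs to `url` and return per-backend shares."""
    return tally(fetch_backend_id(url) for _ in range(requests))
```

Running `sample(...)` from the green client pod before step 4, after step 6 and after step 8 would give three distributions that can be compared directly.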

      Is this normal/expected behaviour? If so, is there a way to find out where this kind of "destination pod stickiness" information is persisted at runtime (we could not find it by querying the Envoy config of either the client proxy or the egress), and can it be disabled?
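One runtime view that is separate from the Envoy config dump is the Envoy admin `/clusters` output, which lists per-endpoint state such as `health_flags`, `rq_total` and `cx_active` as `cluster::address::stat::value` lines. Whether that output explains the behaviour described here is not established by this ticket. The sketch below only parses such text so the endpoints of one cluster can be compared; the function name and the sample cluster name are hypothetical, and IPv6 addresses are not handled.

```python
from collections import defaultdict


def parse_clusters(text: str, cluster_substring: str = ""):
    """Group per-endpoint lines of Envoy's `/clusters` text output.

    Returns {(cluster, address): {stat: value}} for lines of the form
    `cluster::ip:port::stat::value`. Cluster-level lines such as
    `cluster::default_priority::max_connections::1024` are skipped
    because their second field does not end in a numeric port.
    """
    hosts = defaultdict(dict)
    for line in text.splitlines():
        parts = line.strip().split("::")
        if len(parts) != 4:
            continue
        cluster, address, stat, value = parts
        port = address.rpartition(":")[2]
        if not port.isdigit() or cluster_substring not in cluster:
            continue
        hosts[(cluster, address)][stat] = value
    return dict(hosts)
```

Capturing this output on the green client proxy and on the green egress at steps 5, 7 and 9 would show whether any endpoint's flags or connection counters differ between the "stuck" and the rebalanced states.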

      Note that we did not define a DestinationRule with outlierDetection on the green mesh.
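For reference, this is roughly what a DestinationRule with outlierDetection would look like, that is, the kind of resource that was absent in this experiment. It is a sketch only: the resource name, namespace, host and all threshold values are hypothetical, and nothing here implies that adding or omitting such a rule changes the behaviour reported above. The manifest is built as a Python dict and printed as JSON, which `oc apply -f` accepts as well as YAML.

```python
import json


def outlier_detection_rule(name, namespace, host,
                           consecutive_5xx=5, interval="10s",
                           base_ejection_time="30s", max_ejection_percent=50):
    """Build an Istio DestinationRule manifest with outlierDetection.

    All argument values are illustrative placeholders, not settings
    taken from this ticket.
    """
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "DestinationRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "host": host,
            "trafficPolicy": {
                "outlierDetection": {
                    "consecutive5xxErrors": consecutive_5xx,
                    "interval": interval,
                    "baseEjectionTime": base_ejection_time,
                    "maxEjectionPercent": max_ejection_percent,
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical names for the imported red service on the green mesh.
    rule = outlier_detection_rule(
        name="red-webserver-outlier",
        namespace="green-mesh-system",
        host="webserver.red-app.svc.red-mesh-imports.local",
    )
    print(json.dumps(rule, indent=2))
```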

      What is the business impact? Please also provide timeframe information.
      We are experimenting with OSSM and intend to use it to power new production use cases soon.

      This issue is also captured in https://access.redhat.com/support/cases/#/case/03436396 and doc https://docs.google.com/document/d/1673j_r6V1XXS-TNRsviCO9D_CT4nZQOmJRPz6h-TN8Q/edit

            rh-ee-efenness Eoin Fennessy
            rhn-support-sappleton Shaun Appleton