OpenShift Bugs / OCPBUGS-1427

Ignore non-ready endpoints when processing endpointslices

Details

    • Moderate
    • SDN Sprint 226, SDN Sprint 227
    • 2
    • Approved
    • False

      This seems to be a significant regression in ingress performance (10x) when using ovn-kubernetes, but there is no similar regression with openshift-sdn.

    Description

      Description of problem:

      The jump looks worst on GCP, but on closer inspection Azure and AWS both jumped as well, just not as high.

      Disruption data indicates that the image registry on GCP was averaging around 30-40 seconds of disruption during an upgrade, until Aug 27th when it jumped to 125-135 seconds and has remained there ever since.

      We see similar spikes in ingress-to-console and ingress-to-oauth. NOTE: the image registry backend is also behind ingress, so all three are ingress-related disruptions.

      https://datastudio.google.com/s/uBC4zuBFdTE

      These charts show the problem on Aug 27 for registry, ingress to console, and ingress to oauth.

      sdn network type appears unaffected.

      Something merged Aug 26-27 that caused a significant change for anything behind ingress using ovn on gcp.
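
For reference, the behaviour named in the issue title (ignoring non-ready endpoints when processing endpointslices) could look roughly like the sketch below. This is only an illustration, not the actual ovn-kubernetes change: the `Endpoint` and `EndpointSlice` types are simplified stand-ins for the `discovery.k8s.io/v1` API objects, and `isReady` and `readyAddresses` are hypothetical helper names. It assumes a nil `Ready` condition should be treated as ready, as the Kubernetes API documentation suggests for most consumers.

```go
package main

import "fmt"

// EndpointConditions is a simplified stand-in for the conditions on a
// discovery.k8s.io/v1 Endpoint. A nil Ready means the state is unknown.
type EndpointConditions struct {
	Ready *bool
}

// Endpoint is a simplified stand-in for one entry of an EndpointSlice.
type Endpoint struct {
	Addresses  []string
	Conditions EndpointConditions
}

// EndpointSlice is a simplified stand-in for the Kubernetes object.
type EndpointSlice struct {
	Endpoints []Endpoint
}

// isReady (hypothetical helper) reports whether an endpoint should receive
// traffic. An unset Ready condition is treated as ready (assumption).
func isReady(ep Endpoint) bool {
	return ep.Conditions.Ready == nil || *ep.Conditions.Ready
}

// readyAddresses (hypothetical helper) collects the addresses of all ready
// endpoints across the given slices, skipping non-ready ones and dropping
// duplicates while preserving first-seen order.
func readyAddresses(slices []EndpointSlice) []string {
	seen := map[string]bool{}
	var out []string
	for _, s := range slices {
		for _, ep := range s.Endpoints {
			if !isReady(ep) {
				continue // non-ready endpoint: do not use it as a backend
			}
			for _, addr := range ep.Addresses {
				if !seen[addr] {
					seen[addr] = true
					out = append(out, addr)
				}
			}
		}
	}
	return out
}

func boolPtr(b bool) *bool { return &b }

func main() {
	slices := []EndpointSlice{{Endpoints: []Endpoint{
		{Addresses: []string{"10.0.0.1"}, Conditions: EndpointConditions{Ready: boolPtr(true)}},
		{Addresses: []string{"10.0.0.2"}, Conditions: EndpointConditions{Ready: boolPtr(false)}},
		{Addresses: []string{"10.0.0.3"}}, // Ready unset: treated as ready
	}}}
	fmt.Println(readyAddresses(slices)) // prints [10.0.0.1 10.0.0.3]
}
```

With a filter like this, a backend that is not ready (for example, a router pod being replaced during an upgrade) would not be kept as a load balancer target, which is the kind of behaviour the title asks for.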

            People

              rravaiol@redhat.com Riccardo Ravaioli
              rhn-engineering-dgoodwin Devan Goodwin
              Anurag Saxena Anurag Saxena