OpenShift Bugs / OCPBUGS-29012

Azure Service Load Balancer taking long time to get deleted 4.15 and 4.16


    • No
    • CLOUD Sprint 249, CLOUD Sprint 250, CLOUD Sprint 251, CLOUD Sprint 252, CLOUD Sprint 253
    • 5
    • Proposed
    • False
    • In certain cases, connections to the Azure API could hang for extended periods, up to 16 minutes, because no client timeout was set on the API calls made by the cloud provider code. Timeouts have now been added to prevent hanging API calls.
    • Bug Fix
    • In Progress

      Description of problem:

          We see failures in this test:
      
      [Jira:"Networking / router"] monitor test service-type-load-balancer-availability setup 15m1s{ failed during setup error waiting for load balancer: timed out waiting for service "service-test" to have a load balancer: timed out waiting for the condition}
      
      To find recent occurrences, see this search.ci query: https://search.ci.openshift.org/?search=error+waiting+for+load+balancer&maxAge=168h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
      
      Example job: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-azure-sdn-upgrade/1754402739040817152
      
      This failure has caused payloads to fail, for example:
      
      https://amd64.ocp.releases.ci.openshift.org/releasestream/4.16.0-0.ci/release/4.16.0-0.ci-2024-02-01-211543
      https://amd64.ocp.releases.ci.openshift.org/releasestream/4.15.0-0.ci/release/4.15.0-0.ci-2024-02-02-061913
      https://amd64.ocp.releases.ci.openshift.org/releasestream/4.15.0-0.ci/release/4.15.0-0.ci-2024-02-02-001913

      Version-Release number of selected component (if applicable):

          4.15 and 4.16

      How reproducible:

          Intermittent, as shown by the search.ci query above.

      Steps to Reproduce:

          1. Run the e2e tests on 4.15 and 4.16.

      Actual results:

          Timeouts while waiting for the service to get a load balancer.

      Expected results:

          No timeout; the service gets its load balancer successfully.

      Additional info:

          https://issues.redhat.com/browse/TRT-1486 has more info.
          Thread: https://redhat-internal.slack.com/archives/C01CQA76KMX/p1707142256956139

            joelspeed Joel Speed
            dperique@redhat.com Dennis Periquet
            Zhaohua Sun Zhaohua Sun