OpenShift Bugs / OCPBUGS-13366

DNS operator prone to spamming TopologyAwareHintsDisabled events on GCP/Azure since May 5


Details

    • Moderate
    • Sprint 236, Sprint 237, Sprint 238, Sprint 239, Sprint 240, Sprint 241, Sprint 242
    • 7
    • No
    • Rejected
    • False

    Description

      The DNS operator appears to have begun frequently spamming kube Events in some serial jobs across multiple clouds (especially GCP and Azure; AWS is less common, but there are some failures with the same signature).

      Going by the pathological events test, this appears to have started on May 5th. See the Pass Rate By NURP+ Combination panel for where this is most common.

      As of the date of filing, pass rates are:
      56% - gcp, amd64, sdn, ha, serial, techpreview
      57% - gcp, amd64, sdn, ha, serial
      60% - azure, amd64, ovn, ha, serial
      60% - azure, amd64, ovn, ha, serial, techpreview

      The events seem to consistently appear as follows on all clouds:

      ns/openshift-dns service/dns-default hmsg/ade328ddf3 - pathological/true reason/TopologyAwareHintsDisabled Unable to allocate minimum required endpoints to each zone without exceeding overload threshold (5 endpoints, 3 zones), addressType: IPv4 From: 08:58:41Z To: 08:58:42Z
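
One possible reading of the "(5 endpoints, 3 zones)" part of this message is sketched below. It pulls the two counts out of the event line and checks whether each zone can receive its minimum share of endpoints. The 0.2 overload threshold and the ceil(desired / (1 + threshold)) rule are recalled from the upstream Kubernetes EndpointSlice controller and are assumptions, not something stated in this report. The equal CPU split across zones is also an assumption, and the function names are hypothetical.

```python
import math
import re

# Event line copied from the report above (timestamps omitted).
EVENT = (
    "ns/openshift-dns service/dns-default hmsg/ade328ddf3 - pathological/true "
    "reason/TopologyAwareHintsDisabled Unable to allocate minimum required "
    "endpoints to each zone without exceeding overload threshold "
    "(5 endpoints, 3 zones), addressType: IPv4"
)

# ASSUMPTION: value recalled from the upstream EndpointSlice controller,
# not taken from this bug report.
OVERLOAD_THRESHOLD = 0.2


def parse_counts(event: str) -> tuple[int, int]:
    """Extract (endpoints, zones) from the event message."""
    match = re.search(r"\((\d+) endpoints, (\d+) zones\)", event)
    if match is None:
        raise ValueError("no '(N endpoints, M zones)' clause found")
    return int(match.group(1)), int(match.group(2))


def minimum_allocations(endpoints: int, cpu_ratios: list[float]) -> list[int]:
    """Per-zone minimum endpoint count: ceil(desired / (1 + threshold))."""
    return [
        math.ceil(ratio * endpoints / (1 + OVERLOAD_THRESHOLD))
        for ratio in cpu_ratios
    ]


endpoints, zones = parse_counts(EVENT)
# ASSUMPTION: every zone holds the same share of allocatable CPU.
ratios = [1 / zones] * zones
minimums = minimum_allocations(endpoints, ratios)
# Hints are only possible if the per-zone minimums fit in the endpoint total.
hints_possible = sum(minimums) <= endpoints
print(endpoints, zones, minimums, hints_possible)
```

Under these assumptions each zone needs at least 2 endpoints, so 3 zones need 6 while only 5 exist. That would be consistent with the controller emitting TopologyAwareHintsDisabled each time it re-evaluates the service.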
      

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.14-e2e-azure-sdn-techpreview-serial/1656207924667617280 (intervals)

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.14-e2e-gcp-sdn-techpreview-serial/1656207916375478272

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.14-e2e-aws-sdn-serial/1655277608981499904

      The Intervals item under "Debug Tools" is a great way to see these events charted over time; see the "interesting events" section.

       

      test=[sig-arch] events should not repeat pathologically for namespace openshift-dns


            People

              cholman@redhat.com Candace Holman
              rhn-engineering-dgoodwin Devan Goodwin
              Melvin Joseph Melvin Joseph
