  1. OpenShift Bugs
  2. OCPBUGS-9057

Cluster operators console and authentication show degraded for about 6 minutes after updating the ingresscontroller LB scope


Details

    • Important
    • 3
    • Sprint 212, Sprint 229, Sprint 235
    • 3
    • Rejected
    • Unspecified

    Description

      Description of problem:
      After the ingresscontroller LB scope is updated, the console and authentication cluster operators show degraded for about 6 minutes.

      OpenShift release version:
      4.10.0-0.nightly-2021-12-21-130047

      Cluster Platform:
      AWS

      How reproducible:
      100%

      Steps to Reproduce (in detail):
      1. Launch a cluster on AWS.
      2. Change the LB scope:
      $ oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge --patch='{"spec":{"endpointPublishingStrategy":{"loadBalancer":{"scope":"Internal"}}}}'

      3. Check the message from "oc get co/ingress", follow its instructions, and delete the LB service:
      $ oc -n openshift-ingress delete svc/router-default
      service "router-default" deleted

      4. Check the status of the cluster operators:
      $ oc get co
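
To put a number on how long each operator stays unavailable, the check in step 4 can be repeated in a loop. The following is a minimal sketch and not part of the original report: the helper names `co_condition` and `watch_co` are hypothetical, and the only cluster call is `oc get clusteroperator` with a JSONPath filter on the operator's status conditions.

```shell
# Print the status ("True", "False" or "Unknown") of one condition
# of a cluster operator. co_condition is a hypothetical helper name.
co_condition() {
  oc get clusteroperator "$1" \
    -o jsonpath="{.status.conditions[?(@.type==\"$2\")].status}"
}

# Poll the given operators and print one timestamped line per check.
# Usage: watch_co INTERVAL_SECONDS ITERATIONS OPERATOR...
watch_co() {
  wc_interval="$1"
  wc_remaining="$2"
  shift 2
  while [ "$wc_remaining" -gt 0 ]; do
    for wc_op in "$@"; do
      printf '%s %s Available=%s Degraded=%s\n' \
        "$(date -u +%H:%M:%S)" "$wc_op" \
        "$(co_condition "$wc_op" Available)" \
        "$(co_condition "$wc_op" Degraded)"
    done
    wc_remaining=$((wc_remaining - 1))
    # Skip the sleep after the final iteration.
    if [ "$wc_remaining" -gt 0 ]; then sleep "$wc_interval"; fi
  done
}
```

For example, `watch_co 10 60 authentication console` would check both operators every 10 seconds for about 10 minutes; the timestamps of the first and last `Available=False` lines bound the outage window.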

      Actual results:
      While the LB is re-provisioned and the DNS records are refreshed, co/console and co/authentication show degraded for about 6 minutes:

      $ oc get co
      NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication   4.10.0-0.nightly-2021-12-21-130047   False       False         True       5m20s   OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.hongli-a22.qe.devcluster.openshift.com/healthz": dial tcp: lookup oauth-openshift.apps.hongli-a22.qe.devcluster.openshift.com on 172.30.0.10:53: no such host (this is likely result of malfunctioning DNS server)
      <--snip-->
      console          4.10.0-0.nightly-2021-12-21-130047   False       False         False      5m24s   RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com": dial tcp: lookup console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com on 172.30.0.10:53: no such host

          1. Checking again after a while, authentication is available, but console is still not available (Available=False for 6m6s):
            $ oc get co
            NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
            authentication   4.10.0-0.nightly-2021-12-21-130047   True        False         False      37s
            <--snip-->
            console          4.10.0-0.nightly-2021-12-21-130047   False       False         False      6m6s    RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com": dial tcp: lookup console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com on 172.30.0.10:53: no such host

      Expected results:
      Checking the DNS record with nslookup from outside the cluster shows that it is refreshed within about 2 minutes, so co/console and co/authentication should not stay in Degraded status for so long.
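
The 2-minute figure can be measured rather than estimated by timing how long a route hostname takes to resolve again. The following is a rough sketch, not the reporter's actual procedure: `wait_for_dns` is a hypothetical helper, and it assumes `nslookup` exits non-zero when the name is not found, which holds for recent bind-utils builds but may not for older ones.

```shell
# Wait until HOST resolves, then print the elapsed time in seconds.
# Returns 1 if the name still does not resolve after TIMEOUT seconds.
# Usage: wait_for_dns HOST TIMEOUT_SECONDS [INTERVAL_SECONDS]
wait_for_dns() {
  wd_host="$1"
  wd_timeout="$2"
  wd_interval="${3:-5}"
  wd_start=$(date +%s)
  while :; do
    wd_elapsed=$(( $(date +%s) - wd_start ))
    # Assumption: a failed lookup makes nslookup exit non-zero.
    if nslookup "$wd_host" >/dev/null 2>&1; then
      echo "$wd_elapsed"
      return 0
    fi
    if [ "$wd_elapsed" -ge "$wd_timeout" ]; then
      return 1
    fi
    sleep "$wd_interval"
  done
}
```

Run from outside the cluster right after deleting the LB service, for example `wait_for_dns console-openshift-console.apps.hongli-a22.qe.devcluster.openshift.com 600`, and compare the printed value with how long the operators stay unavailable.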

      Impact of the problem:
      unfriendly user experience



          People

            mmasters1@redhat.com Miciah Masters
            rhn-support-hongli Hongan Li
            Hongan Li Hongan Li
            Red Hat Employee
