Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-2178

Cluster install failed, machine-api provisioning workers before dns operator starts

XMLWordPrintable

    • Moderate
    • 1
    • Sprint 238
    • 1
    • Rejected
    • False
    • Hide

      None

      Show
      None

      Description of problem:

      A OSD cluster installation on the GCP cloud provider failed to provision worker machines successfully.

      Analysis of the machine-api-controller logs showed that it was unable to resolve oauth2.googleapis.com. This occurred at 2022-10-10 23:56

      The dns-operator logs show that it did not begin to install until the same time, and was in the process of installing whilst the machine-api-controller was failing to DNS resolve the GCP API.

      The machine-api-controller gave up on provisioning the failed workers before the DNS operator seemed to become ready to allow it to resolve DNS.

      I was able to later confirm that DNS resolution was fine in the machine-api-controller after the DNS operator installed, by deleting the machine that failed to provision, and confirming that a replacement machine could successfully provision.

      Version-Release number of selected component (if applicable):

      4.11.7

      Expected results:

      Cluster DNS should be in an installed state before machine-api pods should be attempting to resolve external DNS addresses.

       

            mmasters1@redhat.com Miciah Masters
            mbargenq Matt Bargenquast (Inactive)
            Hongan Li Hongan Li
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: