Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-29067

[Custom DNS] API_URL and API_INT_URL are not resolvable

XMLWordPrintable

    • No
    • Sprint 249, Sprint 250, Sprint 251, Sprint 252, Sprint 253
    • 5
    • Proposed
    • False
    • Hide

      None

      Show
      None

      Description of problem:

      Bootstrap process failed due to API_URL and API_INT_URL are not resolvable:
      
      Feb 06 06:41:49 yunjiang-dn16d-657jf-bootstrap systemd[1]: bootkube.service: Main process exited, code=exited, status=1/FAILURE
      Feb 06 06:41:49 yunjiang-dn16d-657jf-bootstrap systemd[1]: bootkube.service: Failed with result 'exit-code'.
      Feb 06 06:41:49 yunjiang-dn16d-657jf-bootstrap systemd[1]: bootkube.service: Consumed 1min 457ms CPU time.
      Feb 06 06:41:54 yunjiang-dn16d-657jf-bootstrap systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 1.
      Feb 06 06:41:54 yunjiang-dn16d-657jf-bootstrap systemd[1]: Stopped Bootstrap a Kubernetes cluster.
      Feb 06 06:41:54 yunjiang-dn16d-657jf-bootstrap systemd[1]: bootkube.service: Consumed 1min 457ms CPU time.
      Feb 06 06:41:54 yunjiang-dn16d-657jf-bootstrap systemd[1]: Started Bootstrap a Kubernetes cluster.
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Check if API and API-Int URLs are resolvable during bootstrap
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Checking if api.yunjiang-dn16d.qe.gcp.devcluster.openshift.com of type API_URL is resolvable
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Starting stage resolve-api-url
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Unable to resolve API_URL api.yunjiang-dn16d.qe.gcp.devcluster.openshift.com
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Checking if api-int.yunjiang-dn16d.qe.gcp.devcluster.openshift.com of type API_INT_URL is resolvable
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Starting stage resolve-api-int-url
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Unable to resolve API_INT_URL api-int.yunjiang-dn16d.qe.gcp.devcluster.openshift.com
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[8905]: https://localhost:2379 is healthy: successfully committed proposal: took = 7.880477ms
      Feb 06 06:41:58 yunjiang-dn16d-657jf-bootstrap bootkube.sh[7781]: Starting cluster-bootstrap...
      Feb 06 06:41:59 yunjiang-dn16d-657jf-bootstrap bootkube.sh[8989]: Starting temporary bootstrap control plane...
      Feb 06 06:41:59 yunjiang-dn16d-657jf-bootstrap bootkube.sh[8989]: Waiting up to 20m0s for the Kubernetes API
      Feb 06 06:42:00 yunjiang-dn16d-657jf-bootstrap bootkube.sh[8989]: API is up
      
      install logs:
      ...
      time="2024-02-06T06:54:28Z" level=debug msg="Unable to connect to the server: dial tcp: lookup api-int.yunjiang-dn16d.qe.gcp.devcluster.openshift.com on 169.254.169.254:53: no such host"
      time="2024-02-06T06:54:28Z" level=debug msg="Log bundle written to /var/home/core/log-bundle-20240206065419.tar.gz"
      time="2024-02-06T06:54:29Z" level=error msg="Bootstrap failed to complete: timed out waiting for the condition"
      time="2024-02-06T06:54:29Z" level=error msg="Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane."
      ...
      
      
          

      Version-Release number of selected component (if applicable):

      4.16.0-0.nightly-2024-02-05-184957,openshift/machine-config-operator#4165
      
          

      How reproducible:

      
Always.
          

      Steps to Reproduce:

          1. Enable custom DNS on gcp: platform.gcp.userProvisionedDNS:Enabled and featureSet:TechPreviewNoUpgrade
          2. Create cluster
          3.
          

      Actual results:

      Failed to complete bootstrap process.
          

      Expected results:

      See description.
      
          

      Additional info:

      I believe 4.15 is affected as well once https://github.com/openshift/machine-config-operator/pull/4165 backport to 4.15, currently, it failed at an early phase, see https://issues.redhat.com/browse/OCPBUGS-28969

            sdasu@redhat.com Sandhya Dasu
            yunjiang-1 Yunfei Jiang
            Yunfei Jiang Yunfei Jiang
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated: