- Bug
- Resolution: Done-Errata
- Undefined
- 4.15
- No
- False
- N/A
- Release Note Not Required
- In Progress
When using a disconnected image registry that is hosted at a subdomain of the cluster domain, the Agent-based Installer fails to install an OKD/FCOS cluster. The rendezvous host starts bootkube.sh, which then fails because it cannot resolve the registry's DNS name:
Oct 25 12:47:03 master-0 bootkube.sh[6462]: error: unable to read image virthost.ostest.test.metalkube.org:5000/localimages/local-release-image@sha256:76562238a20f2f4dd45770f00730e20425edd376d30d58d7dafb5d6f02b208c5: Get "https://virthost.ostest.test.metalkube.org:5000/v2/": dial tcp: lookup virthost.ostest.test.metalkube.org: no such host
Oct 25 12:47:03 master-0 systemd[1]: bootkube.service: Main process exited, code=exited, status=1/FAILURE
Oct 25 12:47:03 master-0 systemd[1]: bootkube.service: Failed with result 'exit-code'.
This affected the OpenShift CI jobs 'okd-e2e-agent-compact-ipv4' and 'okd-e2e-agent-sno-ipv6', which are based on openshift-metal3/dev-scripts. An example is an OCP cluster domain (which contains the cluster name) of `ostest.test.metalkube.org` with a disconnected image registry at `virthost.ostest.test.metalkube.org`.
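To check whether a given setup matches the affected configuration, the condition "registry host is a subdomain of the cluster domain" can be sketched as below. This is only an illustrative helper and not code from the installer: the function names `registry_host` and `is_subdomain_of` are hypothetical, and the pull-spec parsing assumes the simple `host[:port]/repository` form seen in this report.

```python
def registry_host(image_ref: str) -> str:
    """Return the host part of a pull spec such as host:port/repo@sha256:...

    Assumption: the reference starts with the registry authority, followed by "/".
    """
    authority = image_ref.split("/", 1)[0]
    if authority.startswith("["):
        # Bracketed IPv6 literal, e.g. [fd00::1]:5000
        return authority[1:authority.index("]")]
    return authority.split(":", 1)[0]


def is_subdomain_of(host: str, domain: str) -> bool:
    """Return True if host lies strictly below domain in the DNS hierarchy."""
    host_labels = host.rstrip(".").lower().split(".")
    domain_labels = domain.rstrip(".").lower().split(".")
    if len(host_labels) <= len(domain_labels):
        return False
    # Compare whole labels so "xostest.test" is not mistaken for "ostest.test".
    return host_labels[-len(domain_labels):] == domain_labels


if __name__ == "__main__":
    image = "virthost.ostest.test.metalkube.org:5000/localimages/local-release-image"
    cluster_domain = "ostest.test.metalkube.org"
    print(is_subdomain_of(registry_host(image), cluster_domain))
```

With the values from this report, the registry host `virthost.ostest.test.metalkube.org` is classified as a subdomain of `ostest.test.metalkube.org`, which is the configuration in which the failure was observed.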
Further diagnostics from the rendezvous host:
[core@master-0 ~]$ sudo podman pull virthost.ostest.test.metalkube.org:5000/localimages/local-release-image@sha256:76562238a20f2f4dd45770f00730e20425edd376d30d58d7dafb5d6f02b208c5
Trying to pull virthost.ostest.test.metalkube.org:5000/localimages/local-release-image@sha256:76562238a20f2f4dd45770f00730e20425edd376d30d58d7dafb5d6f02b208c5...
Error: initializing source docker://virthost.ostest.test.metalkube.org:5000/localimages/local-release-image@sha256:76562238a20f2f4dd45770f00730e20425edd376d30d58d7dafb5d6f02b208c5: pinging container registry virthost.ostest.test.metalkube.org:5000: Get "https://virthost.ostest.test.metalkube.org:5000/v2/": dial tcp: lookup virthost.ostest.test.metalkube.org: no such host
curl -u ocp-user:ocp-pass https://virthost.ostest.test.metalkube.org:5000/v2/_catalog
curl: (6) Could not resolve host: virthost.ostest.test.metalkube.org
[core@master-0 ~]$ dig +noall +answer virthost.ostest.test.metalkube.org
;; communications error to 127.0.0.1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused
virthost.ostest.test.metalkube.org. 0 IN A 192.168.111.1
After stopping systemd-resolved:
[core@master-0 ~]$ curl -u ocp-user:ocp-pass https://virthost.ostest.test.metalkube.org:5000/v2/_catalog
{"repositories":["localimages/installer","localimages/local-release-image"]}
Report and diagnosis output above from afasano@redhat.com.
- blocks
- OCPBUGS-27482 OKD: ABI is broken for OKD/FCOS when disconnected registry is a subdomain of cluster domain
- Closed
- is caused by
- OCPBUGS-19552 OKD: Agent-based Installer is broken for HA-deployments of OKD/FCOS when api-int.* endpoint is not defined
- Closed
- is cloned by
- OCPBUGS-27482 OKD: ABI is broken for OKD/FCOS when disconnected registry is a subdomain of cluster domain
- Closed
- links to
- RHEA-2023:7198 rpm