-
Bug
-
Resolution: Done-Errata
-
Minor
-
None
-
4.16
Description of problem: When the bootstrap times out, the installer tries to download the logs from the bootstrap VM and gives an analysis of what happened. On OpenStack platform, we're currently failing to download the bootstrap logs (tracked in OCPBUGS-34950), which causes the analysis to always return an erroneous message:
time="2024-06-05T08:34:45-04:00" level=error msg="Bootstrap failed to complete: timed out waiting for the condition" time="2024-06-05T08:34:45-04:00" level=error msg="Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane." time="2024-06-05T08:34:45-04:00" level=error msg="The bootstrap machine did not execute the release-image.service systemd unit"
The affirmation that the bootstrap machine did not execute the release-image.service systemd unit is wrong, as I can confirm by SSH'ing to the bootstrap node:
systemctl status release-image.service ● release-image.service - Download the OpenShift Release Image Loaded: loaded (/etc/systemd/system/release-image.service; static) Active: active (exited) since Wed 2024-06-05 11:57:33 UTC; 1h 16min ago Process: 2159 ExecStart=/usr/local/bin/release-image-download.sh (code=exited, status=0/SUCCESS) Main PID: 2159 (code=exited, status=0/SUCCESS) CPU: 47.364s Jun 05 11:57:05 mandre-tnvc8bootstrap systemd[1]: Starting Download the OpenShift Release Image... Jun 05 11:57:06 mandre-tnvc8bootstrap podman[2184]: 2024-06-05 11:57:06.895418265 +0000 UTC m=+0.811028632 system refresh Jun 05 11:57:06 mandre-tnvc8bootstrap release-image-download.sh[2159]: Pulling quay.io/openshift-release-dev/ocp-release@sha256:31cdf34b1957996d5c79c48466abab2fcfb9d9843> Jun 05 11:57:32 mandre-tnvc8bootstrap release-image-download.sh[2269]: 079f5c86b015ddaf9c41349ba292d7a5487be91dd48e48852d10e64dd0ec125d Jun 05 11:57:32 mandre-tnvc8bootstrap podman[2269]: 2024-06-05 11:57:32.82473216 +0000 UTC m=+25.848290388 image pull 079f5c86b015ddaf9c41349ba292d7a5487be91dd48e48852d1> Jun 05 11:57:33 mandre-tnvc8bootstrap systemd[1]: Finished Download the OpenShift Release Image.
The installer was just unable to retrieve the bootstrap logs. Earlier, buried in the installer logs, we can see:
time="2024-06-05T08:34:42-04:00" level=info msg="Failed to gather bootstrap logs: failed to connect to the bootstrap machine: dial tcp 10.196.2.10:22: connect: connection
timed out"
This is what should be reported by the analyzer.
- blocks
-
OCPBUGS-37838 Log bundle analizer gives bogus analyze when failing to download bootstrap logs
- Closed
- is cloned by
-
OCPBUGS-37838 Log bundle analizer gives bogus analyze when failing to download bootstrap logs
- Closed
- links to
-
RHEA-2024:3718 OpenShift Container Platform 4.17.z bug fix update