-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.16.0
Description of problem:
When the installer gathers a log bundle after failure (either automatically or with gather bootstrap), the installer fails to return serial console logs if an SSH connection to the bootstrap node is refused. Even if the serial console logs were collected, the installer exits on error if ssh connection is refused: time="2024-03-09T20:59:26Z" level=info msg="Pulling VM console logs" time="2024-03-09T20:59:26Z" level=debug msg="Search for matching instances by tag in us-west-1 matching aws.Filter{\"kubernetes.io/cluster/ci-op-4ygffz3q-be93e-jnn92\":\"owned\"}" time="2024-03-09T20:59:26Z" level=debug msg="Search for matching instances by tag in us-west-1 matching aws.Filter{\"openshiftClusterID\":\"2f9d8822-46fd-4fcd-9462-90c766c3d158\"}" time="2024-03-09T20:59:27Z" level=debug msg="Attemping to download console logs for ci-op-4ygffz3q-be93e-jnn92-bootstrap" Instance=i-0413f793ffabe9339 time="2024-03-09T20:59:27Z" level=debug msg="Download complete" Instance=i-0413f793ffabe9339 time="2024-03-09T20:59:27Z" level=debug msg="Attemping to download console logs for ci-op-4ygffz3q-be93e-jnn92-master-0" Instance=i-0ab5f920818366bb8 time="2024-03-09T20:59:27Z" level=debug msg="Download complete" Instance=i-0ab5f920818366bb8 time="2024-03-09T20:59:27Z" level=debug msg="Attemping to download console logs for ci-op-4ygffz3q-be93e-jnn92-master-2" Instance=i-0b93963476818535d time="2024-03-09T20:59:27Z" level=debug msg="Download complete" Instance=i-0b93963476818535d time="2024-03-09T20:59:28Z" level=debug msg="Attemping to download console logs for ci-op-4ygffz3q-be93e-jnn92-master-1" Instance=i-0797728e092bfbeef time="2024-03-09T20:59:28Z" level=debug msg="Download complete" Instance=i-0797728e092bfbeef time="2024-03-09T20:59:28Z" level=info msg="Pulling debug logs from the bootstrap machine" time="2024-03-09T20:59:28Z" level=debug msg="Added /tmp/bootstrap-ssh3643557583 to installer's internal agent" time="2024-03-09T20:59:28Z" level=debug msg="Added /tmp/.ssh/ssh-privatekey to installer's internal agent" time="2024-03-09T21:01:39Z" level=error msg="Attempted to gather debug logs after installation failure: failed to connect to the bootstrap machine: dial tcp 13.57.212.80:22: connect: connection timed out" from: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_api/1788/pull-ci-openshift-api-master-e2e-aws-ovn/1766560949898055680 We can see the console logs were downloaded, they should be saved in the log bundle.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Failed install where SSH to bootstrap node fails. https://github.com/openshift/installer/pull/8137 provides a potential reproducer 2. 3.
Actual results:
Expected results:
Additional info:
Error handling needs to be reworked here: https://github.com/openshift/installer/blob/master/cmd/openshift-install/gather.go#L160-L190
- blocks
-
OCPBUGS-32264 installer log bundle should gather console logs even when ssh fails
- Closed
- is cloned by
-
OCPBUGS-32264 installer log bundle should gather console logs even when ssh fails
- Closed
- is depended on by
-
OCPBUGS-32174 Serial Logs not returned when ssh issues occur
- Closed
- links to
-
RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update