-
Bug
-
Resolution: Done-Errata
-
Normal
-
None
-
4.15.0
-
None
-
No
-
False
-
-
-
Bug Fix
This is a clone of issue OCPBUGS-20037. The following is the description of the original issue:
—
Description of problem:
The greenboot health check script writes its logs to the journal under the greenboot-healthcheck unit. Some checks run in background processes, which print their output to stdout/stderr, but journald does not attribute that output to the unit. As a result, those entries are missing from the output of journalctl -u greenboot-healthcheck.
Version-Release number of selected component (if applicable):
4.14
How reproducible:
100%
Steps to Reproduce:
1. Boot a MicroShift host.
2. Trigger a failed greenboot health check for MicroShift.
3. Check the journal output for the greenboot-healthcheck service and observe a failure whose reason/file sections are empty.
Actual results:
Oct 01 17:06:51 edgeniusos01 40_microshift_running_check.sh[1424]: Waiting 300s for 2 pod(s) from the 'kube-system' namespace to be in 'Ready' state
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ======
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: Info: Log file '/var/lib/microshift-backups/prerun_failed.log' does not exist
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ======
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: Failure log in: '/tmp/pod-list.8OaJQ8I3Z8'
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ======
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: Failure log in: '/tmp/pod-events.L9b8QYjl9e'
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: FAILURE
Oct 01 17:06:57 edgeniusos01 greenboot[1315]: Script '40_microshift_running_check.sh' FAILURE (exit code '1'). Continuing...
Expected results:
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[5635]: The number of ready pods in the 'kube-system' namespace is greater than the expected '2' count. Terminating...
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ======
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: Info: Log file '/var/lib/microshift-backups/prerun_failed.log' does not exist
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ======
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: Failure log in: '/tmp/pod-list.8OaJQ8I3Z8'
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[5651]: NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[5651]: cert-manager cert-manager-75d57c8d4b-6vdwh 1/1 Running 3 8h 10.42.0.8 edgeniusos01 <none> <none>
...
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: Failure log in: '/tmp/pod-events.L9b8QYjl9e'
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[5652]: NAMESPACE LAST SEEN TYPE REASON OBJECT MESSAGE
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[5652]: cert-manager 5s Warning NodeNotReady pod/cert-manager-75d57c8d4b-6vdwh Node is not ready
...
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: ------
Oct 01 17:06:57 edgeniusos01 40_microshift_running_check.sh[1424]: FAILURE
Oct 01 17:06:57 edgeniusos01 greenboot[1315]: Script '40_microshift_running_check.sh' FAILURE (exit code '1'). Continuing...
Additional info:
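The description above says that output printed directly by background processes is not attributed to the unit. One possible mitigation, shown here only as a minimal sketch and not as the fix shipped for this issue, is to capture the background command's output in a temporary file and replay it from the main script process. The helper name run_bg_and_replay is hypothetical and does not appear in the MicroShift scripts; whether replaying from the parent process restores unit attribution in journald is an assumption that would need to be verified on a real host.

```bash
#!/bin/bash
# Hypothetical helper (not taken from the MicroShift health check scripts).
# Runs a command in the background with stdout/stderr redirected to a
# temporary file, waits for it to finish, then replays the captured lines
# using shell builtins so that the parent shell process writes them.
run_bg_and_replay() {
    local out line rc=0
    out=$(mktemp) || return 1

    # Start the command in the background; its output goes to the file only.
    "$@" >"${out}" 2>&1 &
    wait "$!" || rc=$?

    # Replay line by line with the printf builtin (no extra child process).
    # The second condition keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "${line}" ]; do
        printf '%s\n' "${line}"
    done <"${out}"

    rm -f "${out}"
    # Propagate the exit status of the background command.
    return "${rc}"
}
```

A call such as `run_bg_and_replay sh -c 'echo checking; exit 1'` prints the captured line from the calling shell and returns the command's exit status, so the caller can still decide whether the check failed.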
- clones
-
OCPBUGS-20037 Greenboot health check logs do not belong to the unit
- Closed
- is blocked by
-
OCPBUGS-20037 Greenboot health check logs do not belong to the unit
- Closed
- links to
-
RHSA-2023:5008 OpenShift Container Platform 4.14.z security update