-
Epic
-
Resolution: Unresolved
-
Minor
-
None
-
Gatekeeper 3.15.0, MCE 2.6.3, ACM 2.11.3
-
None
-
Check that all containers are using terminationMessagePolicy: FallbackToLogsOnError
-
False
-
None
-
False
-
Not Selected
-
To Do
-
Low
Epic Goal
Check that all containers are using terminationMessagePolicy: FallbackToLogsOnError. A pod on an OpenShift cluster can fail in two broad ways. In the first, the pod remains alive but is non-functional. In the second, the application in the pod crashes. In the first case, if the administrator has implemented liveness and readiness checks, OpenShift can stop the pod and restart it on either the same node or a different node in the cluster. In the second case, the application should exit with an exit code and write suitable log entries to help the administrator diagnose the cause of the failure. With FallbackToLogsOnError, when a container exits with an error and its termination message file is empty, Kubernetes uses the tail of the container log as the termination message, so the failure reason appears in the pod status.
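The check described in the goal could be sketched roughly as follows. This is a minimal illustration, not the Gatekeeper operator's actual test: it assumes the pod spec has already been parsed into a Python dict (for example from `oc get pod -o json`), and the function name, container names, and image names are hypothetical.

```python
from typing import Any, Dict, List

REQUIRED_POLICY = "FallbackToLogsOnError"


def containers_missing_policy(pod_spec: Dict[str, Any]) -> List[str]:
    """Return the names of containers that do not use REQUIRED_POLICY."""
    missing = []
    # Init containers carry the same field, so check them as well.
    for field in ("initContainers", "containers"):
        for container in pod_spec.get(field) or []:
            # Kubernetes treats an unset policy as "File".
            policy = container.get("terminationMessagePolicy", "File")
            if policy != REQUIRED_POLICY:
                missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical pod spec: one compliant container, one relying on the default.
example_spec = {
    "containers": [
        {
            "name": "manager",
            "image": "example/manager",
            "terminationMessagePolicy": "FallbackToLogsOnError",
        },
        {"name": "audit", "image": "example/audit"},
    ]
}

print(containers_missing_policy(example_spec))  # prints ['audit']
```

Running the same function over the pod template of every Deployment in the operator's namespaces would give a simple pass/fail list for this epic.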
Why is this important?
This is an optional recommendation from Operator Best Practices analysis. For more info on best practices analysis see the related epic. I'd like a second opinion on the value of this recommendation for consideration in a future release.
See https://kubernetes.io/docs/tasks/debug/debug-application/determine-reason-pod-failure/ for some more details. Have we had trouble capturing why a pod failed?
Scenarios
Content in pod status example:
lastState:
  terminated:
    containerID: cri-o://3a44277dfea349874559ed3553bc4e4f8ee269ec95b6c5c978e3ed622503a4d6
    exitCode: 0
    finishedAt: "2024-12-19T20:10:41Z"
    message: |
      THis is a test
    reason: Completed
    startedAt: "2024-12-19T20:00:41Z"
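To answer whether we can capture why a pod failed, the termination details can be read back from the pod status. The sketch below assumes the status has been parsed into a Python dict; the helper name and the container name `manager` are hypothetical, while the field values are copied from the example above. Note that the example shows exitCode 0, so the log fallback would not have applied there and the message presumably came from the termination message file.

```python
from typing import Any, Dict


def last_termination_details(pod_status: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map container name to exit code, reason, and message of its last termination."""
    details = {}
    for status in pod_status.get("containerStatuses") or []:
        terminated = (status.get("lastState") or {}).get("terminated")
        if terminated:
            details[status.get("name", "<unnamed>")] = {
                "exitCode": terminated.get("exitCode"),
                "reason": terminated.get("reason"),
                # The message is empty when nothing was written and no fallback applied.
                "message": terminated.get("message", ""),
            }
    return details


# Status fragment built from the example above; the container name is made up.
example_status = {
    "containerStatuses": [
        {
            "name": "manager",
            "lastState": {
                "terminated": {
                    "exitCode": 0,
                    "reason": "Completed",
                    "message": "THis is a test\n",
                    "startedAt": "2024-12-19T20:00:41Z",
                    "finishedAt": "2024-12-19T20:10:41Z",
                }
            },
        }
    ]
}

for name, info in last_termination_details(example_status).items():
    print(name, info["exitCode"], info["reason"], info["message"].strip())
```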
Acceptance Criteria
...
Dependencies (internal and external)
- ...
Previous Work (Optional):
- ...
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Doc issue opened with a completed template. Separate doc issue
opened for any deprecation, removal, or any current known
issue/troubleshooting removal from the doc, if applicable.
- is related to
-
ACM-15493 Apply required best practices to the Gatekeeper operator
-
- Review
-