Task
Resolution: Done
Major
Product / Portfolio Work
Outcome: At least one TSG (Troubleshooting Guide) has been created and placed in the appropriate space.
Requirement: Create document(s) providing insight into the following issues:
- Where to look for the component's status (which pods are the most important), with links to any component dashboards and guidance on how to interpret them.
- How to spot indicators of unhealthiness (what to grep for).
- What healthy looks like.
Suggestions:
- Prioritize indicators of service health: where to look, what to grep for, etc.
- Give examples of what normal vs. unhealthy looks like.
- If there are remediation actions that can be tried, describe them.
- If there are already documented failure scenarios from ROSA HCP, provide links to those.
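As one possible shape for the "what to grep for" and "normal vs. unhealthy" items above, a TSG could include a small check like the sketch below. It is illustrative only: the pattern list, the function names, and the sample lines are hypothetical placeholders, not taken from any real component. `CrashLoopBackOff` and `OOMKilled` are standard Kubernetes status strings, but each TSG would need to substitute the signatures that actually matter for its component.

```python
import re

# Hypothetical unhealthy indicators (assumption): a real TSG would replace
# these with the component's own log and status signatures.
UNHEALTHY_PATTERNS = [
    re.compile(r"\bCrashLoopBackOff\b"),
    re.compile(r"\bOOMKilled\b"),
    re.compile(r"\berror\b", re.IGNORECASE),
]


def find_unhealthy_lines(lines):
    """Return the lines that match at least one unhealthy pattern."""
    return [ln for ln in lines if any(p.search(ln) for p in UNHEALTHY_PATTERNS)]


def classify(lines):
    """Label a batch of lines 'unhealthy' if any line matches, else 'healthy'."""
    return "unhealthy" if find_unhealthy_lines(lines) else "healthy"


# Made-up sample output (not from a real cluster), shaped like a pod listing.
NORMAL_SAMPLE = [
    "example-pod-a   1/1   Running   0   3d",
    "example-pod-b   1/1   Running   0   3d",
]
UNHEALTHY_SAMPLE = [
    "example-pod-a   1/1   Running            0    3d",
    "example-pod-b   0/1   CrashLoopBackOff   12   3d",
]

if __name__ == "__main__":
    print("normal sample:   ", classify(NORMAL_SAMPLE))
    print("unhealthy sample:", classify(UNHEALTHY_SAMPLE))
    for line in find_unhealthy_lines(UNHEALTHY_SAMPLE):
        print("  flagged:", line)
```

Keeping the patterns in a single list next to a normal sample and an unhealthy sample gives the reader of the TSG both the indicators and a concrete picture of each state in one place.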