Feature
Resolution: Unresolved
Normal
0% To Do, 0% In Progress, 100% Done
Goal
Scrape ServiceLog records and store them in the CCX data lake, where they can be combined with incidents and other data. The expected outcome is more effective investigation of SRE incidents.
Use case from SRE-P:
We prepare a weekly report with all the incidents from on-call shifts. As an SRE Region Lead, I'd like to have Service Logs available in a single query together with the Incidents to make determining the root cause easier. External Service Logs connected to a cluster usually describe the main issue with that cluster that the customer needs to address. Internal Service Logs might have more context from the SREs.
In scope
Service Log data available in Trino for querying and combining with other CCX data
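As a rough illustration of the "single query" idea from the use case, the sketch below joins incidents to Service Logs by cluster. Everything about the schema is an assumption made for illustration: the table names `incidents` and `service_logs`, the columns `incident_id`, `cluster_id`, `log_id`, `internal_only`, and `summary`, and the demo rows are hypothetical and do not reflect the actual CCX layout. An in-memory SQLite database stands in for Trino so that the example runs on its own; the join itself is plain SQL.

```python
import sqlite3


def incidents_with_service_logs(conn):
    """Return every incident together with any Service Logs for its cluster.

    Hypothetical schema: the table and column names are illustrative
    assumptions, not the real CCX data lake layout.
    """
    query = """
        SELECT i.incident_id, i.cluster_id, s.internal_only, s.summary
        FROM incidents AS i
        LEFT JOIN service_logs AS s ON s.cluster_id = i.cluster_id
        ORDER BY i.incident_id, s.log_id
    """
    # LEFT JOIN keeps incidents that have no Service Log at all.
    return conn.execute(query).fetchall()


def build_demo_db():
    """Create an in-memory database with made-up rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE incidents (incident_id TEXT, cluster_id TEXT);
        CREATE TABLE service_logs (
            log_id INTEGER,
            cluster_id TEXT,
            internal_only INTEGER,
            summary TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO incidents VALUES (?, ?)",
        [("INC-1", "c-1"), ("INC-2", "c-2")],
    )
    conn.executemany(
        "INSERT INTO service_logs VALUES (?, ?, ?, ?)",
        [
            (1, "c-1", 0, "Customer action required"),  # external log
            (2, "c-1", 1, "SRE note"),                  # internal log
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for row in incidents_with_service_logs(build_demo_db()):
        print(row)
```

The left join is a deliberate choice here: an incident with no matching Service Log still appears in the weekly report, with empty Service Log columns, rather than silently dropping out.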
Not in scope
Inclusion of the data in SRE dashboards
Notes
- SRE experts (ljakubow2.openshift, tgabriel.openshift) may be able to help with access to ServiceLog and with querying the data there
Next steps
Manual exploration of the data for resolving SRE incidents. Suggestions for including it in the dashboards.