-
Story
-
Resolution: Done
-
Normal
-
None
-
None
-
False
-
None
-
False
-
-
We've long had a gap in our ability to retrieve logs from HCM (TAFKA SD) hive shards. The controllers there are so busy that the in-cluster pod logs roll over very frequently. The logs are saved to CloudWatch, but accessing them there is so difficult that we mostly don't bother and find other ways to debug instead. This is not ideal.
After consulting AppSRE about this, we learned that the logs are accessible from Grafana through a much lighter process. It is still necessary to use a proprietary query language against a proprietary database schema, but most of this can be encoded into a dashboard that exposes search inputs that make intuitive sense to engineers.
This card is for setting up such a dashboard at a permanent, bookmarkable location. See this thread for:
- A starting point for what the dashboard should look like.
- Details about requirements.
- Links to docs describing the mechanics of creating it in app-interface.