-
Epic
-
Resolution: Unresolved
-
Normal
-
app-sre-bot-poc
-
100% To Do, 0% In Progress, 0% Done
In Q1 we want to create a PoC of an App SRE chat bot that uses LLMs to help identify problems. It should be able to look at three pieces of information:
- Time series data from Prometheus anomaly detection
- App SRE Change log data
- Dependency information
The idea would be to work with the LLMs to identify outages and what may be causing them. For example, can it correlate a reported outage with an anomaly in the metrics, an outage or issue with a dependency, or a change that went through recently?
The definition of done here is not to deliver a finished product, but to determine whether this is possible, figure out how it should hang together, and demonstrate either its feasibility or infeasibility.
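To make the correlation idea concrete, here is a minimal sketch of how the three inputs could be narrowed down before anything is handed to an LLM. Everything in it is an assumption for illustration: the `Event` shape, the `candidate_causes` and `build_prompt` helpers, the one-hour look-back window, and the dependency map format are hypothetical and not part of any existing App SRE tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """One signal from any of the three inputs (hypothetical shape)."""
    source: str        # "anomaly", "change", or "dependency"
    service: str       # service the signal refers to
    timestamp: datetime
    summary: str


def candidate_causes(outage_service, outage_time, events, dependencies,
                     window=timedelta(hours=1)):
    """Return events on the service or its direct dependencies that fall
    within `window` before the outage, most recent first."""
    related = {outage_service} | set(dependencies.get(outage_service, []))
    start = outage_time - window
    hits = [e for e in events
            if e.service in related and start <= e.timestamp <= outage_time]
    return sorted(hits, key=lambda e: e.timestamp, reverse=True)


def build_prompt(outage_service, outage_time, causes):
    """Format the candidate signals as plain text for an LLM to reason over."""
    lines = [
        f"Outage reported for {outage_service} at {outage_time.isoformat()}.",
        "Candidate signals:",
    ]
    lines += [
        f"- [{e.source}] {e.service} at {e.timestamp.isoformat()}: {e.summary}"
        for e in causes
    ]
    lines.append("Which signal most plausibly explains the outage?")
    return "\n".join(lines)


# Made-up example data, for illustration only.
outage_time = datetime(2024, 1, 15, 12, 0)
events = [
    Event("anomaly", "api", outage_time - timedelta(minutes=10), "latency spike"),
    Event("change", "db", outage_time - timedelta(minutes=30), "config rollout"),
    Event("change", "unrelated", outage_time - timedelta(minutes=5), "docs update"),
    Event("dependency", "db", outage_time - timedelta(hours=3), "brief failover"),
]
dependencies = {"api": ["db"]}

causes = candidate_causes("api", outage_time, events, dependencies)
prompt = build_prompt("api", outage_time, causes)
```

Pre-filtering by time window and dependency graph keeps the prompt small and leaves the LLM to rank a short list of plausible causes; whether that split of work is the right one is exactly what the PoC would need to find out.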
1. App SRE Bot: Changelog Data Information | Backlog | Unassigned
2. App SRE Bot: Time Series Data Anomaly Detection | Backlog | Shruthi Raghuraman