Type: Task
Resolution: Done
Priority: Major
Fix Version/s: 1.5.0.GA
Affects Version/s: None
Component/s: None
Labels: None
Epic Link: Operator metrics
Target Release: 1.5.0.GA

We should add Prometheus metrics to our operator to allow for better monitoring: operand object counts to make sure things don't get out of control, reconcile loop performance metrics, and error rates. Error rates will be especially useful because some changes are accepted by the Kubernetes API but are not supported by the Strimzi operator.

Examples of such metrics could be:

Number of clusters it is operating (e.g. 5 Kafka clusters, 3 Connect clusters). This could possibly be extended to health status, e.g. "4 healthy Kafka clusters, 1 unhealthy".
Number of reconciliations (to see that the operator is working)
Number of times an operation timed out (e.g. waiting for a pod to become ready); this might indicate an error or a need to increase a timeout
Possibly the duration of reconciliations or rolling updates
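
To make the examples above concrete, here is a minimal sketch of how such metrics could be tracked and exposed in the Prometheus text format. It is illustrative only: the class, the metric names (`operator_resources`, `operator_reconciliations_total`, and so on) and the `kind` label are hypothetical and not taken from the operator, and a real implementation would use a metrics library rather than hand-rendered text.

```python
from collections import defaultdict


class OperatorMetrics:
    """Tiny in-memory registry for the example metrics (all names hypothetical)."""

    def __init__(self):
        self.resources = {}                      # gauge: kind -> clusters operated
        self.reconciliations = defaultdict(int)  # counter: kind -> reconciliations
        self.failures = defaultdict(int)         # counter: kind -> failed reconciliations
        self.timeouts = defaultdict(int)         # counter: kind -> timed-out operations
        self.duration_sum = defaultdict(float)   # kind -> total seconds reconciling

    def set_resources(self, kind, count):
        """Record how many clusters of the given kind are currently operated."""
        self.resources[kind] = count

    def record_reconciliation(self, kind, seconds, failed=False, timed_out=False):
        """Record one finished reconciliation and how it ended."""
        self.reconciliations[kind] += 1
        self.duration_sum[kind] += seconds
        # Adding 0 still creates the series, so it is exported from the start.
        self.failures[kind] += 1 if failed else 0
        self.timeouts[kind] += 1 if timed_out else 0

    def render(self):
        """Render all metrics in the Prometheus text exposition format."""
        families = [
            ("operator_resources", "gauge", [("", self.resources)]),
            ("operator_reconciliations_total", "counter",
             [("", self.reconciliations)]),
            ("operator_reconciliations_failed_total", "counter",
             [("", self.failures)]),
            ("operator_operation_timeouts_total", "counter",
             [("", self.timeouts)]),
            ("operator_reconciliation_duration_seconds", "summary",
             [("_sum", self.duration_sum), ("_count", self.reconciliations)]),
        ]
        lines = []
        for name, metric_type, series in families:
            lines.append(f"# TYPE {name} {metric_type}")
            for suffix, values in series:
                for kind in sorted(values):
                    lines.append(f'{name}{suffix}{{kind="{kind}"}} {values[kind]}')
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = OperatorMetrics()
    metrics.set_resources("Kafka", 5)
    metrics.set_resources("KafkaConnect", 3)
    metrics.record_reconciliation("Kafka", 1.5)
    metrics.record_reconciliation("Kafka", 2.5, failed=True, timed_out=True)
    print(metrics.render())
```

Labelling each series by resource kind keeps the number of metric names small while still allowing per-kind queries, such as the failure rate for Kafka clusters only. A health breakdown ("4 healthy, 1 unhealthy") could be modelled as a further label on the gauge.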

Assignee: Unassigned

Reporter: Jakub Scholz

Tester: Jakub Stejskal

Votes: 0

Watchers: 2

Created: 2019/11/25 3:23 PM

Updated: 2020/12/18 11:05 AM

Resolved: 2020/04/16 7:22 AM
