-
Spike
-
Resolution: Done
-
Product / Portfolio Work
-
NetObserv - Sprint 236
The scope of our network observability initiative is set to grow beyond netflows. This task is about reviewing network metrics that would be interesting to integrate into our dashboards (console and/or Grafana).
A few examples of what customers are interested in seeing:
- How many routes and shards are in use in the cluster?
- What's the DNS success rate?
We can think of more metrics; don't hesitate to add to this list.
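To make the DNS example concrete, here is a minimal sketch of how a success rate could be derived from response counts grouped by DNS response code. It assumes the cluster exposes a counter broken down by response code (on CoreDNS-based clusters this may be `coredns_dns_responses_total` with an `rcode` label, to be confirmed during the investigation). The function name and the choice of which codes count as "success" are hypothetical.

```python
from typing import Dict, Iterable, Optional


def dns_success_rate(
    responses_by_rcode: Dict[str, float],
    success_rcodes: Iterable[str] = ("NOERROR",),
) -> Optional[float]:
    """Return the share of DNS responses whose rcode counts as a success.

    responses_by_rcode maps a response code (e.g. "NOERROR", "SERVFAIL")
    to the number of responses seen over some time window.
    Which rcodes count as "success" is an assumption left to the caller.
    """
    total = sum(responses_by_rcode.values())
    if total == 0:
        # No traffic in the window: the rate is undefined, not 0% or 100%.
        return None
    accepted = set(success_rcodes)
    ok = sum(count for rcode, count in responses_by_rcode.items() if rcode in accepted)
    return ok / total
```

Whether NXDOMAIN should count as a success is a product decision: the resolver answered correctly, but the name does not exist. Passing `success_rcodes` explicitly keeps that choice visible.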
In this task, we should investigate which metrics are already available in Prometheus (in which case we'd just have to integrate them into our dashboards) and which ones aren't (in which case we should create them).
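One possible way to start this inventory is to list every metric name known to Prometheus and filter by keyword. The sketch below uses the Prometheus HTTP API endpoint `/api/v1/label/__name__/values`. The base URL, the bearer token, and the helper names are assumptions to adapt to the cluster under investigation.

```python
import json
import urllib.request
from typing import Iterable, List, Optional


def metric_names_url(base_url: str) -> str:
    """Build the Prometheus HTTP API URL that lists all metric names."""
    return base_url.rstrip("/") + "/api/v1/label/__name__/values"


def fetch_metric_names(base_url: str, token: Optional[str] = None) -> List[str]:
    """Fetch all metric names from Prometheus (performs a network call).

    base_url and token are assumptions: on a real cluster they depend on
    how the monitoring stack is exposed and secured.
    """
    request = urllib.request.Request(metric_names_url(base_url))
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["data"]


def filter_by_keywords(names: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Keep the metric names containing at least one keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return sorted(name for name in names if any(k in name.lower() for k in lowered))
```

Keyword matching on names is only a first pass: it will miss metrics whose names don't mention the feature, so the resulting list should still be reviewed by hand.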
When a metric is already available, we should check which component provides it and what its scope is (e.g. is it specific to openshift-sdn or ovn-k? Does it depend on the ingress component used in the cluster?), and evaluate whether it could be made more homogeneous (e.g. could the same metric be generated regardless of the cluster network layers in use?). Note that our main focus is still ovn-k.
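To find out which component provides a given metric, one option is to count its series grouped by scrape labels. The sketch below builds such a PromQL query and parses the JSON returned by the Prometheus instant query endpoint (`/api/v1/query`). It assumes the series carry `job` and `namespace` labels, which should be verified on the target cluster. The helper names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def providers_query(metric_name: str) -> str:
    """Build a PromQL query counting the series of a metric per job and namespace."""
    return f'count by (job, namespace) ({{__name__="{metric_name}"}})'


def parse_providers(response: Dict[str, Any]) -> List[Tuple[str, str, int]]:
    """Turn an instant-query response into sorted (job, namespace, series_count) rows.

    Expects the standard vector result shape of the Prometheus HTTP API:
    {"data": {"result": [{"metric": {...}, "value": [timestamp, "count"]}]}}
    """
    rows = []
    for sample in response["data"]["result"]:
        labels = sample["metric"]
        # Sample values are returned as strings; convert to an integer count.
        series_count = int(float(sample["value"][1]))
        rows.append((labels.get("job", ""), labels.get("namespace", ""), series_count))
    return sorted(rows)
```

Running the same query on an openshift-sdn cluster and on an ovn-k cluster would show whether the metric comes from the same component in both cases.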
When a metric isn't available, we must evaluate which component is best suited to provide it. This might be an existing component, such as OVS, or a new service that we create, e.g. to poll cluster stats at a regular interval.