-
Task
-
Resolution: Done
MON Sprint 257
In the context of CNF-11916, the component sriov-network-metrics-exporter [1] is deployed as part of the sriov-network-operator [2][3], a day-two operator that attaches special network interfaces to user workload pods.
The component adds a Prometheus endpoint that publishes metrics about user network traffic (from any namespace) on these NICs. So far, the work has included a ServiceMonitor that instructs the default Prometheus instance to scrape this new endpoint.
The following is an example of the exported metrics:
sriov_kubepoddevice{container="test",dev_type="openshift.io/intelnetdevice",namespace="cnf-4916",pciAddr="0000:17:02.4",pod="netdevice-intel-client"} 1
sriov_vf_rx_broadcast{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 155611
sriov_vf_rx_bytes{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 6.3680622e+07
sriov_vf_rx_dropped{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 0
sriov_vf_rx_multicast{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 98088
sriov_vf_rx_packets{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 342560
sriov_vf_tx_bytes{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 9.866408e+06
sriov_vf_tx_dropped{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 0
sriov_vf_tx_packets{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 144861
Here, `sriov_kubepoddevice` is a join metric that links tx/rx stats to a pod rather than to a PCI device, using a query such as:
(rate(sriov_vf_tx_packets[1m]) * on (pciAddr) group_left(pod,namespace,dev_type) sriov_kubepoddevice)
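To make the join semantics of the query above concrete, here is a minimal Python sketch that emulates `* on (pciAddr) group_left(pod,namespace,dev_type)` over in-memory samples. It is only an illustration, not how Prometheus implements vector matching: the function name `group_left_join`, the rate value 12.5 and the second PCI address `0000:17:02.5` are made up for the example.

```python
def group_left_join(left, right, on="pciAddr",
                    copy_labels=("pod", "namespace", "dev_type")):
    """Emulate `left * on(<on>) group_left(<copy_labels>) right`.

    Each sample is a (labels: dict, value: float) pair. The right-hand
    side is the "one" side: at most one sample per join key.
    """
    one_side = {}
    for labels, value in right:
        key = labels[on]
        if key in one_side:
            raise ValueError(f"duplicate series on the 'one' side for {key}")
        one_side[key] = (labels, value)

    result = []
    for labels, value in left:
        match = one_side.get(labels[on])
        if match is None:
            continue  # no pod is attached to this VF, so the series is dropped
        right_labels, right_value = match
        merged = dict(labels)
        for name in copy_labels:
            if name in right_labels:
                merged[name] = right_labels[name]
        # sriov_kubepoddevice is always 1, so the product keeps the left value
        result.append((merged, value * right_value))
    return result


# Hypothetical per-VF packet rates (what rate(sriov_vf_tx_packets[1m]) might return)
tx_rates = [
    ({"pciAddr": "0000:17:02.4", "pf": "ens2f0", "vf": "12"}, 12.5),
    ({"pciAddr": "0000:17:02.5", "pf": "ens2f0", "vf": "13"}, 3.0),  # unattached VF
]
pod_devices = [
    ({"pciAddr": "0000:17:02.4", "pod": "netdevice-intel-client",
      "namespace": "cnf-4916", "dev_type": "openshift.io/intelnetdevice",
      "container": "test"}, 1.0),
]

joined = group_left_join(tx_rates, pod_devices)
for labels, value in joined:
    print(labels, value)
```

Only the VF that appears in `sriov_kubepoddevice` survives the join, and it carries the `pod`, `namespace` and `dev_type` labels of the pod that owns it.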
My questions are:
a. I can query the above-mentioned metrics in the Admin -> Observe -> Metrics page. How, if at all possible, can I add an entry to the Admin -> Observe -> Dashboards panel?
b. For the `Developer -> Observe` section, I understand that metrics must have a `namespace` label to be queried. So, I'm working on adding a PrometheusRule [4] to create a namespaced version of those metrics. Is this the best way to solve the problem? Any suggestions on the value to set for `spec.groups[*].interval`?
c. What's the process to add a dashboard in the `Developer -> Observe -> Dashboards` section?
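For question b, the kind of PrometheusRule I have in mind can be sketched as follows. This is a rough Python sketch that builds the manifest as a dictionary, not the content of [4]: the resource name, the namespace, the group name and the `network:` prefix of the recorded series are placeholders chosen for illustration. When `interval` is omitted from a group, Prometheus falls back to its global evaluation interval, so the sketch leaves it out unless a value is passed.

```python
import json

JOIN = "* on (pciAddr) group_left(pod, namespace, dev_type) sriov_kubepoddevice"


def build_prometheus_rule(metrics, interval=None,
                          name="sriov-namespaced-metrics",              # placeholder
                          namespace="openshift-sriov-network-operator"):  # assumed
    """Return a PrometheusRule manifest with one recording rule per metric."""
    group = {
        "name": "sriov-network-metrics",  # placeholder group name
        "rules": [
            # "network:<metric>" is a hypothetical name for the recorded series
            {"record": f"network:{metric}", "expr": f"{metric} {JOIN}"}
            for metric in metrics
        ],
    }
    if interval is not None:
        group["interval"] = interval  # e.g. "30s"; otherwise the global default applies
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"groups": [group]},
    }


rule = build_prometheus_rule(["sriov_vf_tx_packets", "sriov_vf_rx_packets"])
print(json.dumps(rule, indent=2))
```

The JSON output can be applied like any other manifest. Each recorded series then carries the `namespace` label of the pod that owns the VF.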
[1] https://github.com/openshift/sriov-network-metrics-exporter
[2] https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/655
[3] https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/687
[4] https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/732/files