OpenShift Monitoring / MON-3973

[consultancy] Adding SRIOV metrics in OpenShift Console


    • Type: Task
    • Resolution: Done
    • MON Sprint 257

      In the context of CNF-11916, the component sriov-network-metrics-exporter [1] is deployed as part of the sriov-network-operator [2][3], a day-two operator that attaches special network interfaces to user workload pods.
      The component adds a Prometheus endpoint that publishes metrics about user network traffic (from any namespace) on these NICs. So far, the work has included a ServiceMonitor that instructs the default Prometheus instance to scrape this new endpoint.
       
      The following is an example of the exported metrics:

      sriov_kubepoddevice{container="test",dev_type="openshift.io/intelnetdevice",namespace="cnf-4916",pciAddr="0000:17:02.4",pod="netdevice-intel-client"} 1
      sriov_vf_rx_broadcast{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 155611
      sriov_vf_rx_bytes{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 6.3680622e+07
      sriov_vf_rx_dropped{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 0
      sriov_vf_rx_multicast{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 98088
      sriov_vf_rx_packets{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 342560
      sriov_vf_tx_bytes{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 9.866408e+06
      sriov_vf_tx_dropped{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 0
      sriov_vf_tx_packets{numa_node="0",pciAddr="0000:17:02.4",pf="ens2f0",vf="12"} 144861
      

      where `sriov_kubepoddevice` is a join metric that links tx/rx stats to a pod rather than to a PCI device, using a query like:

      (rate(sriov_vf_tx_packets[1m]) * on (pciAddr)  group_left(pod,namespace,dev_type)  sriov_kubepoddevice)
      

      My questions are:
      a. I can query the above-mentioned metrics in the Admin -> Observe -> Metrics page. Is it possible to add an entry to the Admin -> Observe -> Dashboards panel, and if so, how?
      b. For the `Developer -> Observe` section, I understand that metrics must have a `namespace` label to be queried. So, I'm working on adding a PrometheusRule [4] that creates a namespaced version of those metrics. Is this the best way to solve the problem? Any suggestions for the value of `spec.groups[*].interval`?
      c. What's the process to add a dashboard in the `Developer -> Observe -> Dashboards` section?
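
      A minimal sketch of the kind of PrometheusRule described in (b), not the content of [4]. The resource name, its namespace, the group name, the recording-rule name, and the `interval` value are all placeholders chosen for illustration. It records the join query shown earlier, so the resulting series carries the workload's `pod` and `namespace` labels taken from `sriov_kubepoddevice`.

      ```yaml
      apiVersion: monitoring.coreos.com/v1
      kind: PrometheusRule
      metadata:
        name: sriov-vf-namespaced-example            # hypothetical name
        namespace: openshift-sriov-network-operator  # assumed namespace
      spec:
        groups:
          - name: sriov-network-metrics.rules        # hypothetical group name
            interval: 30s                            # placeholder only; the right value is the open question in (b)
            rules:
              # The recorded series takes pod/namespace/dev_type from sriov_kubepoddevice.
              - record: network:sriov_vf_tx_packets:rate1m   # hypothetical rule name
                expr: |
                  rate(sriov_vf_tx_packets[1m])
                    * on (pciAddr) group_left (pod, namespace, dev_type)
                  sriov_kubepoddevice
      ```

      Rules for the rx and byte counters would presumably follow the same pattern, one `record` entry per source metric.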
       

      [1] https://github.com/openshift/sriov-network-metrics-exporter
      [2] https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/655
      [3] https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/687
      [4] https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/732/files

              spasquie@redhat.com Simon Pasquier
              apanatto@redhat.com Andrea Panattoni