  1. OpenShift Data Foundation Request For Enhancement
  2. ODFRFE-42

[RFE][GSS] Add Prometheus metrics about IO/s, bandwidth and IO latency on each PV and object bucket backed by ODF


    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Normal
    • odf-4.16.z

      Say, for example, you have a cluster with a workload of 500 clients on 300 CephFS volumes. The PromQL query below returns an observed total bandwidth consumption of 500 Mb/s.

      sum by (name) (rate(ceph_pool_wr_bytes[5m]) * on (pool_id) group_left(name) ceph_pool_metadata{name=~"ocs-storagecluster-cephfilesystem-data0"})
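
The query above can also be run outside the console through the Prometheus HTTP API (`/api/v1/query`). The following is a minimal Python sketch, not a supported tool: the base URL passed in is a placeholder, and the authentication that an OpenShift monitoring stack normally requires is left out.

```python
import json
import urllib.parse
import urllib.request

# The PromQL query from this request, unchanged.
QUERY = (
    'sum by (name) (rate(ceph_pool_wr_bytes[5m]) '
    '* on (pool_id) group_left(name) '
    'ceph_pool_metadata{name=~"ocs-storagecluster-cephfilesystem-data0"})'
)


def build_query_url(base_url: str, query: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def run_query(base_url: str, query: str) -> list:
    """Run the query and return the list of result series."""
    with urllib.request.urlopen(build_query_url(base_url, query)) as resp:
        body = json.load(resp)
    # An instant query returns its series under data.result.
    return body["data"]["result"]
```

Calling `run_query("http://localhost:9090", QUERY)` (the URL is an assumption) would return one series per pool name, each carrying the summed write rate in bytes per second.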

      Before 4.16, there is no way to know which clients account for that bandwidth; the per-client metrics ceph_mds_client_metrics_ocs_storagecluster* do not exist. Since 4.16, per-client metrics are available, so you have the consumption per client, but you still have no way of knowing which pods are behind each client ID.

      Requested solution:
      Add a metric that provides this information, for example:

      ceph_mds_client_info{client="client.8675309", ip_addr="10.10.10.10", subvol="csi-vol-0db1a72c-6589-4e98-877b-e602a872b7f4", hostname="ahost.afundomain.com", ...}
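
To illustrate how such an info metric could be used, the sketch below joins per-client throughput to `ceph_mds_client_info` label sets on the `client` label and totals the result per subvolume. It is only an illustration: the throughput figures and the second client are made up, and it assumes the per-client metrics would carry a matching `client` label, which this request does not confirm.

```python
from collections import defaultdict

# Label sets in the shape of the proposed ceph_mds_client_info metric.
# The first entry copies the example above; the second is invented.
client_info = [
    {"client": "client.8675309", "ip_addr": "10.10.10.10",
     "subvol": "csi-vol-0db1a72c-6589-4e98-877b-e602a872b7f4",
     "hostname": "ahost.afundomain.com"},
    {"client": "client.1234567", "ip_addr": "10.10.10.11",
     "subvol": "csi-vol-0db1a72c-6589-4e98-877b-e602a872b7f4",
     "hostname": "bhost.afundomain.com"},
]

# Hypothetical per-client write throughput in bytes per second.
client_write_bps = {"client.8675309": 3_000_000, "client.1234567": 1_500_000}


def throughput_by(label, info, throughput):
    """Sum per-client throughput grouped by one label of the info metric."""
    totals = defaultdict(int)
    for labels in info:
        # Clients without a throughput sample contribute nothing.
        totals[labels[label]] += throughput.get(labels["client"], 0)
    return dict(totals)
```

Grouping by `subvol` attributes the traffic to a volume, while grouping by `hostname` gives a per-node view of the same data.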

              Assignee: Unassigned
              Reporter: Kelson White (rhn-support-kelwhite)
              Votes: 1
              Watchers: 7

                Created:
                Updated: