  1. OpenShift Monitoring
  2. MON-1564

Fix PodMetrics not accounting for the whole pod usage



      The PodMetrics reported by prometheus-adapter do not reflect the CPU and memory usage of pods correctly.
      Currently, they only account for the usage of the containers inside each pod. This does not represent the whole pod's usage, because the sum of the container values differs from what the pod-level cgroup reports.

      Essentially, we want the PodMetrics values to reflect the cadvisor metrics that have pod!="",container="", which account for the stats of the whole pod, and to have "ContainerMetrics" for each container. Note that this might require adding a new podQuery field to the prometheus-adapter configuration.
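
To make the intended split concrete, here is a minimal Python sketch. It is not prometheus-adapter code: the sample values are made up, and the names `SAMPLES`, `split_metrics`, and `unaccounted` are hypothetical. It only illustrates taking the pod-level value from the series with container="" and the per-container values from the series with container!="".

```python
from collections import defaultdict

# Hypothetical cadvisor-style samples as (labels, value) pairs.
# The numbers are invented for illustration only.
SAMPLES = [
    ({"pod": "prometheus-k8s-0", "container": ""}, 120.0),
    ({"pod": "prometheus-k8s-0", "container": "prometheus"}, 90.0),
    ({"pod": "prometheus-k8s-0", "container": "config-reloader"}, 10.0),
]


def split_metrics(samples):
    """Split samples into pod-level and per-container values.

    Series with pod!="" and container="" feed the pod-level value;
    series with container!="" feed the per-container values.
    """
    pod_metrics = {}
    container_metrics = defaultdict(dict)
    for labels, value in samples:
        pod = labels.get("pod", "")
        if not pod:
            continue  # not a pod-scoped series
        container = labels.get("container", "")
        if container == "":
            pod_metrics[pod] = pod_metrics.get(pod, 0.0) + value
        else:
            container_metrics[pod][container] = value
    return pod_metrics, dict(container_metrics)


def unaccounted(pod_metrics, container_metrics, pod):
    """Usage seen at the pod level but missing from the container sum."""
    return pod_metrics[pod] - sum(container_metrics.get(pod, {}).values())
```

With the invented samples above, the pod-level value is 120.0 while the containers sum to 100.0, so summing containers alone would under-report the pod by 20.0.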

      Test queries exposing the problem:

      sum(container_cpu_usage_seconds_total{pod="prometheus-k8s-0", container=""}) - sum(container_cpu_usage_seconds_total{pod="prometheus-k8s-0", container!=""})

      sum(container_memory_working_set_bytes{pod="prometheus-k8s-0", container=""}) - sum(container_memory_working_set_bytes{pod="prometheus-k8s-0", container!=""})
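
To run the same check against other pods, the two test queries can be generated from a small template. The Python helper below is a hypothetical convenience (the name `gap_query` is not part of any tool); it assumes the pod name needs no label-value escaping.

```python
def gap_query(metric: str, pod: str) -> str:
    """Build a PromQL expression for pod-level usage minus summed container usage.

    Assumes `pod` contains no characters that need escaping in a label value.
    """
    pod_level = f'sum({metric}{{pod="{pod}", container=""}})'
    per_container = f'sum({metric}{{pod="{pod}", container!=""}})'
    return f"{pod_level} - {per_container}"


if __name__ == "__main__":
    for metric in (
        "container_cpu_usage_seconds_total",
        "container_memory_working_set_bytes",
    ):
        print(gap_query(metric, "prometheus-k8s-0"))
```

A non-zero result from either generated query indicates usage that the container series do not account for.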
      

      More information is available in this Slack thread: https://coreos.slack.com/archives/C0VMT03S5/p1615550752086600

      DoD:

        - The PodMetrics exposed by prometheus-adapter should reflect the CPU and memory usage of pods correctly.

            Unassigned
            Damien Grisonnet (dgrisonn@redhat.com)
            Hongyan Li