  1. OpenShift Monitoring
  2. MON-1564

Fix PodMetrics not accounting for the whole pod usage

      The PodMetrics reported by prometheus-adapter do not correctly reflect the CPU and memory usage of pods.
      Currently, they only account for the usage of the containers inside each pod, which does not represent the whole pod usage because it differs from the usage reported by the pod-level cgroup.

      Essentially, we want the PodMetrics values to reflect the cAdvisor metrics that have pod!="",container="", which account for the stats of the whole pod, and then have "ContainerMetrics" for each container. Note that this might require adding a new podQuery field to the prometheus-adapter configuration.

      Test queries exposing the problem:

      sum(container_cpu_usage_seconds_total{pod="prometheus-k8s-0", container=""}) - sum(container_cpu_usage_seconds_total{pod="prometheus-k8s-0", container!=""})

      sum(container_memory_working_set_bytes{pod="prometheus-k8s-0", container=""}) - sum(container_memory_working_set_bytes{pod="prometheus-k8s-0", container!=""})
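
      The same comparison can be run for every pod at once instead of a single named pod. The queries below are a minimal sketch, not part of the original report. They assume that the cAdvisor series carry a namespace label, and the 5m rate window is an arbitrary choice for illustration. A non-zero result for a pod is the share of its usage that the per-container series do not account for.

      ```promql
      # CPU: pod-level cgroup usage rate minus the sum of per-container usage rates.
      # Assumption: 5m window and the namespace label are illustrative choices.
      sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{pod!="", container=""}[5m]))
      -
      sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{pod!="", container!=""}[5m]))

      # Memory: pod-level working set minus the sum of per-container working sets.
      sum by (namespace, pod) (container_memory_working_set_bytes{pod!="", container=""})
      -
      sum by (namespace, pod) (container_memory_working_set_bytes{pod!="", container!=""})
      ```

      Both sides of each subtraction are aggregated to the same (namespace, pod) label set, so the operands match one-to-one and the result has one series per pod.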
      

      More info can be found in this slack thread: https://coreos.slack.com/archives/C0VMT03S5/p1615550752086600

      DoD:

        - The PodMetrics exposed by prometheus-adapter should reflect the CPU and memory usage of pods correctly.

              Unassigned
              Damien Grisonnet (dgrisonn@redhat.com)
              Hongyan Li