OpenShift Request For Enhancement / RFE-4824

Prometheus metrics to monitor containers' EmptyDir filesystem usage in OpenShift


    • Feature Request
    • Resolution: Done
    • Normal
    • None
    • openshift-4.12.z, openshift-4.13.z
    • Monitoring, Node

      1. Proposed title of this feature request

      Prometheus metrics to calculate containers' total filesystem usage including EmptyDir volumes

      2. What is the nature and description of the request?

      Prometheus metrics (especially container_fs_usage_bytes) don't accurately report the total filesystem usage of each container or pod. They don't take EmptyDir volumes into account.

      3. Why does the customer need this? (List the business requirements here)

      When a node's filesystem is exhausted, there should be a way to tell which particular container is consuming most of the node's available filesystem space.

      4. List any affected packages or components.

      OpenShift monitoring stack - Prometheus



      • Create a pod with two containers and configure one container to mount a volume of type EmptyDir
      • Add a 5GB file to the EmptyDir mount point
      • Monitor the node's, pod's, and container's filesystem usage using Prometheus metrics
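
      The first step above could be sketched as a small Python script that prints a Deployment manifest as JSON. The container names, volume name, mount path, namespace, and the "simple" prefix come from the output in this ticket; the Deployment kind is inferred from the pod name, and the image and sleep command are assumptions chosen only for illustration.

```python
import json


def build_deployment(name="simple", namespace="test",
                     image="registry.access.redhat.com/ubi9/ubi-minimal"):
    """Build a two-container Deployment where only container1 mounts an emptyDir.

    The image and the sleep command are assumptions, not taken from the ticket.
    """
    keep_alive = ["sleep", "infinity"]
    containers = [
        {
            "name": "container1",
            "image": image,
            "command": keep_alive,
            # Only container1 mounts the emptyDir volume.
            "volumeMounts": [
                {"name": "empty-dir-volume", "mountPath": "/mnt/mydata"}
            ],
        },
        {"name": "container2", "image": image, "command": keep_alive},
    ]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": containers,
                    "volumes": [{"name": "empty-dir-volume", "emptyDir": {}}],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_deployment(), indent=2))
```

      Assuming the script is saved under a hypothetical name such as make_deployment.py, its output can be piped to "oc apply -f -". The 5GB file could then be created with something like "oc exec <pod> -c container1 -- dd if=/dev/zero of=/mnt/mydata/big-file bs=1M count=5120", provided the chosen image ships dd.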


      # oc get pods -o wide
      NAME                      READY   STATUS    RESTARTS   AGE   IP            NODE                  NOMINATED NODE   READINESS GATES
      simple-866f479df4-bnsqw   2/2     Running   0          17m   ipi1-p6h6d-master-0   <none>           <none>
      # oc set volume pod/simple-866f479df4-bnsqw
        empty directory as empty-dir-volume
          mounted at /mnt/mydata in container container1
        unknown as kube-api-access-jwz7x
          mounted at /var/run/secrets/kubernetes.io/serviceaccount in container container1
          mounted at /var/run/secrets/kubernetes.io/serviceaccount in container container2
      # oc exec simple-866f479df4-bnsqw -c container1 -- ls -lh /mnt/mydata
      total 5.1G
      -rw-rw-rw-. 1 1000670000 1000670000 5.0G Oct 24 09:31 big-file
      # oc exec simple-866f479df4-bnsqw -c container2 -- ls -lh /mnt/mydata
      ls: cannot access /mnt/mydata: No such file or directory
      command terminated with exit code 2

      From Prometheus, run the following queries:

      sum(container_fs_usage_bytes{node="ipi1-p6h6d-master-0"})

      container_fs_usage_bytes{namespace="test", pod="simple-866f479df4-bnsqw"}

      Neither query shows any change in the collected values after the 5GB file is added to the EmptyDir volume.
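
      For comparison, and not part of the original request: the kubelet Summary API (reachable with "oc get --raw /api/v1/nodes/<node>/proxy/stats/summary") reports per-pod volume usage, which should include EmptyDir volumes. The sketch below reads such a document and returns the used bytes per volume for one pod. The sample values are made up for illustration, and the helper name pod_volume_usage is hypothetical.

```python
def pod_volume_usage(summary, namespace, pod):
    """Return {volume_name: usedBytes} for one pod in a Summary API document.

    All volumes of the pod are listed, not only emptyDir ones.
    """
    for entry in summary.get("pods", []):
        ref = entry.get("podRef", {})
        if ref.get("namespace") == namespace and ref.get("name") == pod:
            return {
                vol["name"]: vol.get("usedBytes", 0)
                for vol in entry.get("volume", [])
            }
    return {}


# Illustrative sample only: the byte counts are invented, not measured.
sample = {
    "pods": [
        {
            "podRef": {"name": "simple-866f479df4-bnsqw", "namespace": "test"},
            "volume": [
                {"name": "empty-dir-volume", "usedBytes": 5 * 1024**3},
                {"name": "kube-api-access-jwz7x", "usedBytes": 12288},
            ],
        }
    ]
}

if __name__ == "__main__":
    usage = pod_volume_usage(sample, "test", "simple-866f479df4-bnsqw")
    for name, used in sorted(usage.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {used} bytes")
```

      In practice the input would be the parsed JSON from the "oc get --raw" call rather than the inline sample. This only shows where the data is visible today; it does not replace the Prometheus metric this request asks for.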



            gausingh@redhat.com Gaurav Singh
            rhn-support-aelganzo Amr Elganzory