OpenShift Bugs / OCPBUGS-28926

[4.15] Random numbers in pids.max file on pods as well as on nodes.


    • No
    • OCPNODE Sprint 249 (Green), OCPNODE Sprint 250 (Green), OCPNODE Sprint 251 (Green), OCPNODE Sprint 252 (Green)
    • 4
    • Rejected
    • False
    • Fix a bug where containers would have a skewed view of the pids limit in their cgroup hierarchy. It would report as a seemingly random number instead of `max`. Note: the containers do not have max pids; they are limited by the pod pid limit, which is set outside of the container's cgroup hierarchy and thus not visible from within the container (see the sketch after this list).
    • Bug Fix
    • In Progress
    • Customer Escalated
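      A minimal sketch of the distinction described in the doc text above (the pod name, QoS slice, and pod UID are placeholders, not values from this bug):

      # Inside the container: only the container-scoped pids.max is visible,
      # which is not where the pod pid limit is configured.
      oc rsh <podname>
      cat /sys/fs/cgroup/pids/pids.max

      # On the node: the pod-scoped slice, one level above the container,
      # carries the configured podPidsLimit (4096 by default in this report).
      cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-<qos>.slice/kubepods-<qos>-pod<POD_UID>.slice/pids.max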

      Description of problem:
      In a 4.12 OCP cluster, the default podPidsLimit is 4096 when checked at the node level under /sys/fs/cgroup/pids/kubepods.slice/kubepods-*/*/pids.max.

      for f in /sys/fs/cgroup/pids/kubepods.slice/kubepods-*/*/pids.max; do echo $f.; cat $f; done
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podb910bab7_528b_48c1_a0d5_72493eea2e0d.slice/pids.max.
      4096
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod08371dcf_fcf7_49e7_84ef_d3887fcc7694.slice/pids.max.
      4096
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod10795289d34b5e76d3845007b0111048.slice/pids.max.
      4096
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod188fca37_2aef_4668_8f5d_c2a390e86cc6.slice/pids.max.
      4096
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2a47340c_46ef_41aa_94ce_10bc726ab328.slice/pids.max.
      4096
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod3c0e1a64_46bc_41f9_9adf_b73157d7ae86.slice/pids.max.
      4096
      /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod3e77ad6b_6fdc_4936_8054_14be023f26d8.slice/pids.max.
      4096
      

      However, when we check inside any pod or on the node itself, we see a pids.max file with a seemingly random number. See sections [A] and [B].

      • [A] pids.max inside the container (showing a random number)
      [root@mirrorreg1 ~]# oc rsh <podname>
      sh-4.4$ cat /sys/fs/cgroup/pids/pids.max
      1288812
      
      • [B] pids.max on the node (showing a random number)
      cat /sys/fs/cgroup/pids/kubepods.slice/pids.max 
      127385
      

      Can someone please help me understand:

      • Why do we have three pids.max values on the cluster, and which one should we consider as the pod pid limit?
      • If the default podPidsLimit is 4096, why do we see two other pids.max files with random numbers inside the pod and on the node?
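      One way to tie [A] and [B] back to the 4096 values listed above is to look up a pod's UID and read its pod-scoped slice on the node; a rough sketch, assuming cgroup v1 paths as in this report (the pod and namespace names are placeholders, the slice name uses the pod UID with dashes replaced by underscores, and the parent slice depends on the pod's QoS class):

      # Get the pod UID; dashes in the UID become underscores in the slice name.
      oc get pod <podname> -n <namespace> -o jsonpath='{.metadata.uid}{"\n"}'

      # On the pod's node, read the pod-scoped limit. This is the podPidsLimit (4096 here);
      # the container-scoped pids.max in [A] and kubepods.slice/pids.max in [B] are different cgroups.
      cat /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<UID_WITH_UNDERSCORES>.slice/pids.max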
      Version-Release number of selected component (if applicable):
      
      

      How reproducible:

      On the node:

      1. Log in to an OCP node.
      2. Check the path /sys/fs/cgroup/pids/kubepods.slice/pids.max.

      On a pod:

      1. Log in to any pod.
      2. Check the path /sys/fs/cgroup/pids/pids.max (cat /sys/fs/cgroup/pids/pids.max).
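      The node-side and pod-side checks above can also be run in one pass; a sketch, with the pod name as a placeholder:

      # On the node (logged in as in the steps above):
      cat /sys/fs/cgroup/pids/kubepods.slice/pids.max            # unexpected number
      for f in /sys/fs/cgroup/pids/kubepods.slice/kubepods-*/*/pids.max; do
        echo "$f"; cat "$f"                                      # per-pod slices show 4096
      done

      # Inside a pod:
      oc rsh <podname>
      cat /sys/fs/cgroup/pids/pids.max                           # unexpected number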

      Actual results:

      See sections [A] and [B] above: pids.max inside the container and in kubepods.slice on the node show seemingly random numbers.

      Expected results:

      The pod pids limit should only be visible at /sys/fs/cgroup/pids/kubepods.slice/kubepods-*/*/pids.max.
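      For completeness, the 4096 value in those files comes from the kubelet's podPidsLimit setting; a quick way to confirm it on a node (assuming the rendered kubelet configuration lives at /etc/kubernetes/kubelet.conf, as on RHCOS nodes):

      # On the node: check the configured per-pod PID limit in the kubelet config.
      grep -i podPidsLimit /etc/kubernetes/kubelet.conf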
      

      Additional info:

      This behavior can be seen in any OCP cluster. Do let me know if you need any logs.
      

              Peter Hunt (pehunt@redhat.com)
              Poornima Singour (rhn-support-psingour)
              Min Li
              Votes: 0
              Watchers: 7
