OCPBUGS-20492

crun not respecting cpu-quota:disable (or cpu-load-balancing:disable) annotations correctly


    • No
    • Rejected
    • True
    • https://issues.redhat.com/browse/OCPBUGS-22665
    • Previously, CRI-O did not configure the cgroup hierarchy correctly to account for the way crun creates cgroups. As a consequence, disabling CPU quota with a PerformanceProfile did not work. With this fix, using a PerformanceProfile to disable CPU quota works as expected. (link:https://issues.redhat.com/browse/OCPBUGS-20492[*OCPBUGS-20492*])
    • Bug Fix
    • Done
    • 10/26: fcast for 4.14.1, green

      Description of problem:

      Our testpmd_DPDK application drops RX packets, even at very low line rates. Digging deeper, we found that when the process is spawned via crun it gets cpu.cfs_quota_us = 60000, whereas with runc it gets the expected value of cpu.cfs_quota_us = -1. The pod definition and the OCP install are identical in both cases; the only thing we change is toggling between crun and runc.
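For context on the two values: in the CFS bandwidth controller, a quota of -1 means no limit, while a positive quota caps the cgroup at quota/period of CPU time per period. The report does not give cpu.cfs_period_us, so the sketch below assumes the common default of 100000 µs; under that assumption 60000 corresponds to 0.6 of a CPU, which would be consistent with a busy-polling DPDK thread being throttled. The function name is hypothetical.

```python
from typing import Optional

# Assumption: the report does not state cpu.cfs_period_us; 100000 us is the
# common default period and is used here only for illustration.
DEFAULT_CFS_PERIOD_US = 100_000


def cpu_limit_from_quota(quota_us: int,
                         period_us: int = DEFAULT_CFS_PERIOD_US) -> Optional[float]:
    """Translate a CFS quota into a CPU limit.

    Returns None when the quota is negative (-1 means "no limit"),
    otherwise the number of CPUs' worth of time allowed per period.
    """
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    if quota_us < 0:
        return None
    return quota_us / period_us


if __name__ == "__main__":
    for quota in (60000, -1):
        print(f"cpu.cfs_quota_us={quota}: limit={cpu_limit_from_quota(quota)}")
```

If the actual period on the node differs, pass it as `period_us` rather than relying on the assumed default.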
      

      Version-Release number of selected component (if applicable):

      OpenShift 4.14.0-rc.4
      

      How reproducible:

      100% of the time
      

      Steps to Reproduce:

      1. Deploy OpenShift with crun and the telco DU PerformanceProfile and SRIOV configured
      2. Deploy the pod as specified in the attached 'dpdk-testpmd.yaml' which has 'cpu-load-balancing.crio.io: disable' and 'cpu-quota.crio.io: disable'
      3. Check the cpu.cfs_quota_us value for the resulting process, for example: 'cat /sys/fs/cgroup/cpu,cpuacct//kubepods.slice/kubepods-pod51723ec0_d2f3_484f_941b_f9fc53de17b7.slice/crio-e1a05a6f6d566ebc68948742cde62d591584f9befe232f85510b1bd06f35ed24.scope/[container/]cpu.cfs_quota_us'
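The check in step 3 could also be scripted. The following is a minimal sketch, assuming a cgroup v1 layout like the one in the path above; the helper names are hypothetical, and the directory to inspect is passed in by the caller rather than discovered.

```python
from pathlib import Path
from typing import Optional


def read_cfs_quota(cgroup_dir: str) -> Optional[int]:
    """Return cpu.cfs_quota_us for a cgroup v1 directory, or None if the file is absent."""
    quota_file = Path(cgroup_dir) / "cpu.cfs_quota_us"
    if not quota_file.is_file():
        return None
    return int(quota_file.read_text().strip())


def quota_is_disabled(cgroup_dir: str) -> bool:
    """True only when the quota file exists and holds -1 (no CFS limit)."""
    return read_cfs_quota(cgroup_dir) == -1


if __name__ == "__main__":
    import sys

    # Usage: pass one or more cgroup directories, e.g. the crio-<id>.scope
    # directory and, if present, its container/ subdirectory.
    for directory in sys.argv[1:]:
        print(directory, read_cfs_quota(directory))
```

With the annotations honored, `quota_is_disabled` would be expected to return True for the container's cgroup; in the failing crun case described here it would return False because the file holds 60000.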
      

      Actual results:

      60000
      

      Expected results:

      -1
      

      Additional info:

      - Switching back to runc solves the problem: quota and load balancing behave as expected there.
      - We also noticed that load balancing is not set correctly with crun.
      - We're working on a simpler reproducer that doesn't require SRIOV or TestPMD.
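When comparing crun and runc, it may help to look at more than one level of the hierarchy, since CFS quota is enforced at every cgroup level and the path in step 3 includes an optional container/ directory. The sketch below is one way to list the quota at each level between a leaf cgroup and a chosen ancestor; the function names are hypothetical, and which level was actually misconfigured in this bug is not stated in the report.

```python
from pathlib import Path
from typing import List, Optional, Tuple

Level = Tuple[str, Optional[int]]


def quotas_along_path(leaf: str, stop_at: str) -> List[Level]:
    """List (directory, cpu.cfs_quota_us) from `leaf` up to and including `stop_at`.

    A level with no quota file is reported with None.
    """
    leaf_path = Path(leaf).resolve()
    stop_path = Path(stop_at).resolve()
    if leaf_path != stop_path and stop_path not in leaf_path.parents:
        raise ValueError("stop_at must be the leaf itself or one of its ancestors")

    levels: List[Level] = []
    current = leaf_path
    while True:
        quota_file = current / "cpu.cfs_quota_us"
        quota = int(quota_file.read_text().strip()) if quota_file.is_file() else None
        levels.append((str(current), quota))
        if current == stop_path:
            break
        current = current.parent
    return levels


def limiting_levels(levels: List[Level]) -> List[str]:
    """Return the directories whose quota is set to something other than -1."""
    return [directory for directory, quota in levels
            if quota is not None and quota != -1]


if __name__ == "__main__":
    import sys

    # Usage: script.py <leaf cgroup dir> <ancestor cgroup dir>
    for directory, quota in quotas_along_path(sys.argv[1], sys.argv[2]):
        print(f"{quota!s:>8}  {directory}")
```

Running this against the same pod under crun and under runc would show whether a quota remains on the pod slice, the crio scope, or the container level.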
      

              pehunt@redhat.com Peter Hunt
              jramsay1@redhat.com Jim Ramsay
              Min Li Min Li
              Votes: 0
              Watchers: 30
