Type: Bug
Resolution: Done
Priority: Critical
Fix Version/s: 4.14.z
Affects Version/s: 4.14.z, 4.15.z
Component/s: Documentation / CNF
Labels:
- TPSDocs:Triaged

Severity: Important
Regression: No
Story Points: 2
Sprint: T&PS 2024 #9
sprint_count: 1
Blocked: False
Blocked Reason: None
Latest Status Summary: 2024-08-19: Merged PRs for this change

Description of problem:

Enabling cgroup v2 is not possible if you are using performance profiles. However, the official documentation states: "Currently, disabling CPU load balancing is not supported by cgroup v2. As a result, you might not get the desired behavior from performance profiles if you have cgroup v2 enabled. Enabling cgroup v2 is not recommended if you are using performance profiles."

The last sentence should be rephrased as: "Enabling cgroup v2 is not possible if you are using performance profiles."
PerformanceProfile with cgroup v2 is supported only from OCP 4.16 onwards.

Version-Release number of selected component (if applicable):

4.15.9

How reproducible:

Followed the official documentation https://docs.openshift.com/container-platform/4.15/nodes/clusters/nodes-cluster-cgroups-2.html.

Steps to Reproduce:

1. From 4.14 onwards, cgroup v2 is enabled by default.
2. However, if a performance profile is applied on a cluster that uses cgroup v2, the nodes are reverted to cgroup v1.
   Without a performance profile it works as expected, i.e. we can easily switch from cgroup v1 to cgroup v2 or vice versa.
3. First, I checked the default cgroup version on the node:

$ oc debug node/master-2.perfprofile.lab.upshift.rdu2.redhat.com
sh-4.4# chroot /host
sh-5.1# stat -c %T -f /sys/fs/cgroup
cgroup2fs

4. Create and apply a performance profile. The YAML is below:
$ cat pf.yaml 
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: worker-hci
  annotations:
    kubeletconfig.experimental: |
      { "shutdownGracePeriod":"30s",
        "shutdownGracePeriodCriticalPods":"10s"
      }
spec:
  additionalKernelArgs:
    - idle=poll
    - rcu_nocb_poll
    - irqaffinity= 0-3
    - nmi_watchdog=0
    - audit=0
  cpu:
    isolated: 2-3
    reserved: 0-1
  globallyDisableIrqLoadBalancing: true
  realTimeKernel:
    enabled: false
  numa:
    topologyPolicy: "single-numa-node"
  net:
    userLevelNetworking: true
  workloadHints:
    highPowerConsumption: true
  nodeSelector:
    node-role.kubernetes.io/worker-hci: ""
  machineConfigSelector:
    machineconfiguration.openshift.io/role: worker-hci


5. After the performance profile is applied, a new rendered configuration is generated and all the nodes are rebooted.
6. Check the cgroup version on the same node again:

$ oc debug node/master-2.perfprofile.lab.upshift.rdu2.redhat.com
sh-4.4# chroot /host
sh-5.1# stat -c %T -f /sys/fs/cgroup
tmpfs   <---- Reverted back to cgroup v1
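
The before/after check in the steps above can be wrapped in a small helper. This is only a sketch: the function names `cgroup_version_from_fstype` and `report_cgroup_version` are hypothetical and not part of any OpenShift tooling. It assumes, as in the output above, that `cgroup2fs` means cgroup v2 and `tmpfs` means cgroup v1.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup
# to a cgroup version, following the outputs shown in this report.
#   cgroup2fs -> v2 (unified hierarchy)
#   tmpfs     -> v1 (legacy hierarchy)
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Hypothetical helper: report the version for the current host.
# Intended to be run on the node after `chroot /host`.
report_cgroup_version() {
    fstype=$(stat -c %T -f /sys/fs/cgroup 2>/dev/null) || fstype=""
    echo "fstype=${fstype:-n/a} cgroup=$(cgroup_version_from_fstype "$fstype")"
}
```

Calling `report_cgroup_version` once before applying the performance profile and once after the reboot would make the reversion from v2 to v1 visible in a single line each time.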

is duplicated by

OCPBUGS-24314 [enterprise-4.14] Using a performanceprofile is not just not recommended with cgroupv2, it switches the cluster back to v1.

Closed

OCPBUGS-37801 Remove note stating that cgroupv2 has limitations with PerfProfiles

Closed

Assignee: Kevin Quinn

Reporter: SAYED AMRIN HANIF

Votes: 0

Watchers: 3

Created: 2024/06/13 10:44 AM

Updated: 2024/12/18 10:29 PM

Resolved: 2024/08/19 12:04 PM
