-
Bug
-
Resolution: Done
-
Critical
-
4.14.z, 4.15.z
-
Quality / Stability / Reliability
-
False
-
-
2
-
Important
-
No
-
2024-08-19: Merged PRs for this change
-
T&PS 2024 #9
-
1
Description of problem:
Enabling cgroup v2 is not possible if you are using performance profiles. The official documentation currently says: "Currently, disabling CPU load balancing is not supported by cgroup v2. As a result, you might not get the desired behavior from performance profiles if you have cgroup v2 enabled. Enabling cgroup v2 is not recommended if you are using performance profiles." It should be rephrased as: "Enabling cgroup v2 is not possible if you are using performance profiles." PerformanceProfile with cgroup v2 is supported only from OCP 4.16 onward.
Version-Release number of selected component (if applicable):
4.15.9
How reproducible:
Followed the official documentation https://docs.openshift.com/container-platform/4.15/nodes/clusters/nodes-cluster-cgroups-2.html.
Steps to Reproduce:
1. From 4.14 onward, cgroup v2 is enabled by default.
2. If cgroup v2 is applied together with a performance profile, the node is reverted to cgroup v1.
Without a performance profile it works as expected, i.e. we can switch from cgroup v1 to cgroup v2 and vice versa.
3. First, I checked the default cgroup version on the node:
$ oc debug node/master-2.perfprofile.lab.upshift.rdu2.redhat.com
sh-4.4# chroot /host
sh-5.1# stat -c %T -f /sys/fs/cgroup
cgroup2fs
4. Create and apply a performance profile. The YAML is below:
$ cat pf.yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: worker-hci
  annotations:
    kubeletconfig.experimental: |
      { "shutdownGracePeriod":"30s",
        "shutdownGracePeriodCriticalPods":"10s"
      }
spec:
  additionalKernelArgs:
    - idle=poll
    - rcu_nocb_poll
    - irqaffinity= 0-3
    - nmi_watchdog=0
    - audit=0
  cpu:
    isolated: 2-3
    reserved: 0-1
  globallyDisableIrqLoadBalancing: true
  realTimeKernel:
    enabled: false
  numa:
    topologyPolicy: "single-numa-node"
  net:
    userLevelNetworking: true
  workloadHints:
    highPowerConsumption: true
  nodeSelector:
    node-role.kubernetes.io/worker-hci: ""
  machineConfigSelector:
    machineconfiguration.openshift.io/role: worker-hci
5. After the performance profile is applied, a new rendered configuration is generated and all the nodes are rebooted.
6. Check the cgroup version on the node again:
$ oc debug node/master-2.perfprofile.lab.upshift.rdu2.redhat.com
sh-4.4# chroot /host
sh-5.1# stat -c %T -f /sys/fs/cgroup
tmpfs <---- Reverted back to cgroupv1
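For anyone repeating the check from steps 3 and 6, the interpretation of the stat output can be sketched as a small shell helper. This is only an illustration: the function name cgroup_version_from_fstype is hypothetical and not part of oc or any OpenShift tooling, and it assumes the two filesystem types seen in this report (cgroup2fs for cgroup v2, tmpfs for cgroup v1) are the only ones of interest.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or OpenShift): map the filesystem
# type reported for /sys/fs/cgroup to a cgroup version label.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;       # unified hierarchy, as in step 3
    tmpfs)     echo "v1" ;;       # legacy hierarchy, as in step 6
    *)         echo "unknown" ;;  # anything else is left unclassified
  esac
}

# Usage on a node, after `oc debug node/<node>` and `chroot /host`:
#   cgroup_version_from_fstype "$(stat -c %T -f /sys/fs/cgroup)"
```

Running the usage line before and after applying the performance profile should make the revert visible without having to remember which filesystem type corresponds to which cgroup version.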
- is duplicated by
-
OCPBUGS-24314 [enterprise-4.14] Using a performanceprofile is not just not recommended with cgroupv2, it switches the cluster back to v1.
-
- Closed
-
-
OCPBUGS-37801 Remove note stating that cgroupv2 has limitations with PerfProfiles
-
- Closed
-