Bug
Resolution: Unresolved
Major
4.12, 4.14, 4.18
Quality / Stability / Reliability
False
Important
Done
Known Issue
Description of problem:
AMD-Vi IRQs are not affined to the CPUs in smp_affinity_list after the performance profile has been applied. They always use the first CPU IDs, and changing the affinity manually has no effect:

[root@worker-0 ~]# ls /proc/irq/26
AMD-Vi  affinity_hint  effective_affinity  effective_affinity_list  node  smp_affinity  smp_affinity_list  spurious
[root@worker-0 ~]# cat /proc/irq/26/smp_affinity_list
88-95,184-191
[root@worker-0 ~]# echo "84-87" > /proc/irq/26/smp_affinity_list
[root@worker-0 ~]# cat /proc/irq/26/smp_affinity_list
88-95,184-191
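The check described above can be scripted. The following is a minimal sketch, not a supported tool: it assumes that AMD-Vi IRQs can be recognised by the `AMD-Vi` entry inside their `/proc/irq/<N>/` directory (as in the `ls` output above), and the function names `expand_cpulist`, `cpulist_is_subset`, and `check_amd_vi_irqs` are hypothetical helpers introduced only for illustration.

```shell
#!/usr/bin/env bash

# Expand a kernel CPU list such as "88-95,184-191" into one CPU ID per line.
expand_cpulist() {
    local part lo hi
    local -a parts
    IFS=',' read -ra parts <<< "$1"
    for part in "${parts[@]}"; do
        lo=${part%-*}   # "88-95" -> 88, "2" -> 2
        hi=${part#*-}   # "88-95" -> 95, "2" -> 2
        seq "$lo" "$hi"
    done
}

# Succeed if every CPU in list $1 is also present in list $2.
cpulist_is_subset() {
    local cpu
    for cpu in $(expand_cpulist "$1"); do
        expand_cpulist "$2" | grep -qx "$cpu" || return 1
    done
}

# Compare effective_affinity_list with smp_affinity_list for each AMD-Vi IRQ.
# $1 is the IRQ directory (default /proc/irq); it is a parameter only so the
# function can be pointed at a copy of the tree, e.g. from a sosreport.
check_amd_vi_irqs() {
    local root=${1:-/proc/irq} dir irq want eff
    for dir in "$root"/*/AMD-Vi; do
        [ -d "$dir" ] || continue
        irq=$(basename "$(dirname "$dir")")
        want=$(cat "$root/$irq/smp_affinity_list")
        eff=$(cat "$root/$irq/effective_affinity_list")
        if cpulist_is_subset "$eff" "$want"; then
            echo "IRQ $irq ok: effective=$eff requested=$want"
        else
            echo "IRQ $irq MISMATCH: effective=$eff requested=$want"
        fi
    done
}
```

After sourcing the file, running `check_amd_vi_irqs` on the affected node should report one line per AMD-Vi IRQ, flagging those whose effective affinity lies outside the requested list.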
Version-Release number of selected component (if applicable):
Probably all versions of OpenShift; so far tested on OCP 4.12, 4.14, and 4.18.
How reproducible:
100%
Steps to Reproduce:
1. Apply the performance profile.
2. After it has been applied, query the affinity of the AMD-Vi IRQs.
3. Try to change smp_affinity_list as described above.
Actual results:
AMD-Vi IRQs are not affined to the smp_affinity_list or reserved CPU list
Expected results:
AMD-Vi IRQs should be affined to the smp_affinity_list or reserved CPU list
Additional info:
Validated in the following hardware:

System Information
    Manufacturer: Dell Inc.
    Product Name: PowerEdge R7615

Processor Information
    Socket Designation: CPU1
    Type: Central Processor
    Family: Zen
    Manufacturer: AMD
    ID: 11 0F A1 00 FF FB 8B 17
    Signature: Family 25, Model 17, Stepping 1
    Version: AMD EPYC 9654P 96-Core Processor
    Core Count: 96
    Core Enabled: 96
    Thread Count: 192

[core@worker-0 ~]$ sudo dmesg | grep AMD-Vi
[ 0.193265] AMD-Vi: Using global IVHD EFR:0x25bf732fa2295afe, EFR2:0x1d
[ 1.892561] pci 0000:c0:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.892593] pci 0000:80:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.892613] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.892639] pci 0000:40:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.895566] pci 0000:c0:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 1.895572] AMD-Vi: Extended features (0x25bf732fa2295afe, 0x1d): PPR X2APIC NX GT [5] IA GA PC GA_vAPIC
[ 1.895583] pci 0000:80:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 1.895586] AMD-Vi: Extended features (0x25bf732fa2295afe, 0x1d): PPR X2APIC NX GT [5] IA GA PC GA_vAPIC
[ 1.895595] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 1.895597] AMD-Vi: Extended features (0x25bf732fa2295afe, 0x1d): PPR X2APIC NX GT [5] IA GA PC GA_vAPIC
[ 1.895606] pci 0000:40:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 1.895608] AMD-Vi: Extended features (0x25bf732fa2295afe, 0x1d): PPR X2APIC NX GT [5] IA GA PC GA_vAPIC
[ 1.895617] AMD-Vi: Interrupt remapping enabled
[ 1.895619] AMD-Vi: X2APIC enabled
[ 1.895637] AMD-Vi: Virtual APIC enabled
AMD-Vi IRQs always use the first CPU IDs:
$ oc get performanceprofile -o yaml | head -30
apiVersion: v1
items:
- apiVersion: performance.openshift.io/v2
  kind: PerformanceProfile
  metadata:
    name: blueprint-profile
  spec:
    additionalKernelArgs:
    - nohz_full=0-93,96-189
    cpu:
      isolated: 0-93,96-189
      reserved: 94-95,190-191

$ CPUMAX=`cat /proc/cpuinfo | grep processor | tail -n 1 | egrep -o [0-9]*$`
$ echo === NAME of IRQs for every CPU ===
$ for C in `seq 0 $CPUMAX` ; do
    echo -n CPU${C}:
    IRQS=`grep -H ${C} /proc/irq/*/effective_affinity_list | grep :${C}$ | cut -f 4 -d '/'`
    for I in $IRQS ; do
      IRQNAME=`cat /proc/interrupts | grep \ ${I}\: | awk '{print $(NF)}'`
      echo -n " "${IRQNAME}
    done
    echo
  done
=== NAME of IRQs for every CPU ===
CPU0: timer
CPU1:
CPU2: AMD-Vi
CPU3: AMD-Vi
CPU4: AMD-Vi
CPU5: AMD-Vi
...
This cluster uses the last 4 CPUs of a CCX as reserved CPUs. The idea is to leave the other CCXs (11 groups of 16 CPUs) to the workloads (isolated CPUs). I believe that is why smp_affinity_list was set to 88-95,184-191, which is the full CCX that contains the reserved CPUs:
[core@worker-0 ~]$ lscpu -e | grep "191"
191 0 0 95 95:95:95:11 yes
[core@worker-0 ~]$ lscpu -e | grep ":11 "
 88 0 0 88 88:88:88:11 yes
 89 0 0 89 89:89:89:11 yes
 90 0 0 90 90:90:90:11 yes
 91 0 0 91 91:91:91:11 yes
 92 0 0 92 92:92:92:11 yes
 93 0 0 93 93:93:93:11 yes
 94 0 0 94 94:94:94:11 yes
 95 0 0 95 95:95:95:11 yes
184 0 0 88 88:88:88:11 yes
185 0 0 89 89:89:89:11 yes
186 0 0 90 90:90:90:11 yes
187 0 0 91 91:91:91:11 yes
188 0 0 92 92:92:92:11 yes
189 0 0 93 93:93:93:11 yes
190 0 0 94 94:94:94:11 yes
191 0 0 95 95:95:95:11 yes

[core@worker-0 ~]$ cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-eca7b576fcf4e3884470fb6bd0b922a280a6f872c453072f58eb164102ec2261/vmlinuz-4.18.0-372.146.1.el8_6.x86_64 ignition.platform.id=metal ostree=/ostree/boot.0/rhcos/eca7b576fcf4e3884470fb6bd0b922a280a6f872c453072f58eb164102ec2261/0 root=UUID=4c4afd3c-b829-44e0-93a1-6ee8082c2472 rw rootflags=prjquota boot=UUID=5815c856-bbea-4035-b961-5f27d65c33df skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on rcu_nocbs=0-93,96-189 tuned.non_isolcpus=c0000000,00000000,00000000,c0000000,00000000,00000000 systemd.cpu_affinity=191,190,94,95 iommu=pt isolcpus=managed_irq,0-93,96-189 nohz_full=0-93,96-189 amd_pstate=passive
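The tuned.non_isolcpus value on the kernel command line is a hexadecimal CPU mask, which makes it hard to compare with the reserved CPU list at a glance. The sketch below decodes it, assuming the usual kernel bitmap layout (comma-separated 32-bit groups, most significant group first, the same layout used by /proc/irq/<N>/smp_affinity). The function name `mask_to_cpus` is a hypothetical helper, not part of tuned or the kernel.

```shell
#!/usr/bin/env bash

# Print the CPU IDs set in a comma-separated hex CPU mask, one per line.
# Assumes 32 bits per group with the most significant group first.
mask_to_cpus() {
    local -a groups
    local i bit word base n
    IFS=',' read -ra groups <<< "$1"
    n=${#groups[@]}
    # Walk the groups from least significant (last) to most significant (first).
    for (( i = n - 1; i >= 0; i-- )); do
        word=$(( 16#${groups[i]} ))
        base=$(( (n - 1 - i) * 32 ))   # CPU ID of bit 0 in this group
        for (( bit = 0; bit < 32; bit++ )); do
            if (( (word >> bit) & 1 )); then
                echo $(( base + bit ))
            fi
        done
    done
}
```

For the mask above, `mask_to_cpus c0000000,00000000,00000000,c0000000,00000000,00000000` prints 94, 95, 190, and 191, which matches the reserved CPUs in the performance profile (94-95,190-191).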
4.14 Logs:
SOSReport: https://issues.redhat.com/secure/attachment/13439126/sosreport-worker-0-2025-06-17-oveuywj.tar.xz
must-gather: https://issues.redhat.com/secure/attachment/13439169/must_gather.tar.gz