Bug
Resolution: Unresolved
Normal
None
4.19.0
None
Description of problem:
Tuned does not honor the device exclusion set through devices_udev_regex=^INTERFACE=(?!enp41s0). As a result, the excluded device's network queue (combined channel) count is still changed to match the number of reserved CPUs. The problem reproduces mostly on Mellanox cards and has not been seen on Intel cards.
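For reference, a minimal sketch of how the exclusion pattern is expected to behave. It assumes the regex is searched in multiline mode against the device's udev properties rendered as KEY=VALUE lines; that is an assumption about the matching logic, not something confirmed in this report. The helper name is_selected and the sample property sets are hypothetical, and the interface name is the one from the rendered tuned.conf in this report.

```python
import re

# Pattern copied from the [net] section of the rendered tuned.conf.
PATTERN = r"^INTERFACE=(?!ens8f0np0)"


def is_selected(udev_properties: dict) -> bool:
    """Return True if the device would be picked up for tuning.

    Assumption: properties are joined as KEY=VALUE lines and searched with
    re.MULTILINE, so ^ anchors at the start of every property line.
    """
    text = "".join(f"{key}={value}\n" for key, value in sorted(udev_properties.items()))
    return re.search(PATTERN, text, re.MULTILINE) is not None


# Hypothetical property sets for illustration only.
excluded = {"INTERFACE": "ens8f0np0", "ID_NET_DRIVER": "mlx5_core"}
other = {"INTERFACE": "ens8f1np1", "ID_NET_DRIVER": "mlx5_core"}

print(is_selected(excluded))  # False: the negative lookahead rejects this name
print(is_selected(other))     # True: any other interface name matches
```

Under these assumptions the pattern on its own rejects ens8f0np0, along with any interface whose name merely starts with that string.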
Version-Release number of selected component (if applicable):
4.19.0-ec.2
How reproducible:
Every time, on worker nodes with Mellanox cards (mlx5_core driver)
Steps to Reproduce:
1. Configure a PerformanceProfile that excludes one of the devices, as shown below:
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: performance
spec:
  cpu:
    isolated: 1,3-39,41,43-79
    reserved: 0,2,40,42
  hugepages:
    defaultHugepagesSize: 1G
    pages:
    - count: 1
      node: 0
      size: 1G
    - count: 128
      node: 1
      size: 2M
  kernelPageSize: 4k
  machineConfigPoolSelector:
    machineconfiguration.openshift.io/role: worker-cnf
  net:
    devices:
    - interfaceName: '!ens8f0np0'
    userLevelNetworking: true
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
  numa:
    topologyPolicy: single-numa-node
  realTimeKernel:
    enabled: false
  workloadHints:
    highPowerConsumption: true
    perPodPowerManagement: false
    realTime: true
2. Apply the PerformanceProfile.
3. Verify the device's combined channel count using ethtool.
Actual results:
sh-5.1# ethtool -l ens8f0np0
Channel parameters for ens8f0np0:
Pre-set maximums:
RX: n/a
TX: n/a
Other: n/a
Combined: 63
Current hardware settings:
RX: n/a
TX: n/a
Other: n/a
Combined: 4
Expected results:
sh-5.1# ethtool -l ens8f0np0
Channel parameters for ens8f0np0:
Pre-set maximums:
RX: n/a
TX: n/a
Other: n/a
Combined: 63
Current hardware settings:
RX: n/a
TX: n/a
Other: n/a
Combined: 63
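The comparison between the two outputs above can be automated. The following is a rough sketch, not part of any existing tooling: it parses `ethtool -l` text and flags the case where the current combined channel count equals the number of reserved CPUs while the pre-set maximum is higher. The function names parse_channels and count_cpus are hypothetical, and the sample text is the actual result from this report.

```python
def parse_channels(output: str) -> dict:
    """Parse `ethtool -l` text into {"max": {...}, "current": {...}}."""
    result = {"max": {}, "current": {}}
    section = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            # "n/a" and other non-numeric values are stored as None.
            result[section][key.strip()] = int(value) if value.isdigit() else None
    return result


def count_cpus(cpulist: str) -> int:
    """Count CPUs in a list such as "0,2,40,42" or "1,3-39"."""
    total = 0
    for part in cpulist.split(","):
        low, _, high = part.partition("-")
        total += int(high or low) - int(low) + 1
    return total


ACTUAL = """Channel parameters for ens8f0np0:
Pre-set maximums:
RX: n/a
TX: n/a
Other: n/a
Combined: 63
Current hardware settings:
RX: n/a
TX: n/a
Other: n/a
Combined: 4
"""

channels = parse_channels(ACTUAL)
reserved = count_cpus("0,2,40,42")  # spec.cpu.reserved from the profile
current = channels["current"]["Combined"]
maximum = channels["max"]["Combined"]

# True means the excluded device was still reduced to the reserved CPU count.
print(current == reserved and current < maximum)
```

Run against the expected output instead, the same check would print False because the current value stays at the maximum of 63.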
Additional info:
sh-5.1# cat tuned.conf | grep -v ^$ | grep -v ^#
[main]
summary=Openshift node optimized for deterministic performance at the cost of increased power consumption, focused on low latency network performance. Based on Tuned 2.11 and Cluster node tuning (oc 4.5)
include=openshift-node,cpu-partitioning${f:regex_search_ternary:${f:exec:uname:-r}:rt:,openshift-node-performance-rt-performance:};
openshift-node-performance-${f:lscpu_check:Vendor ID\:\s*GenuineIntel:intel:Vendor ID\:\s*AuthenticAMD:amd:Vendor ID\:\s*ARM:arm}-${f:lscpu_check:Architecture\:\s*x86_64:x86:Architecture\:\s*aarch64:aarch64}-performance
[variables]
isolated_cores=1,3-39,41,43-79
not_isolated_cores_expanded=${f:cpulist_invert:${isolated_cores_expanded}}
[cpu]
force_latency=cstate.id:1|3
governor=performance
energy_perf_bias=performance
min_perf_pct=100
[service]
service.stalld=start,enable
[vm]
transparent_hugepages=never
[irqbalance]
enabled=false
[scheduler]
runtime=0
group.ksoftirqd=0:f:11:*:ksoftirqd.*
group.rcuc=0:f:11:*:rcuc.*
group.ktimers=0:f:11:*:ktimers.*
default_irq_smp_affinity = ignore
irq_process=false
[sysctl]
kernel.hung_task_timeout_secs=600
kernel.nmi_watchdog=0
kernel.sched_rt_runtime_us=-1
vm.stat_interval=10
kernel.timer_migration=1
net.ipv4.tcp_fastopen=3
vm.dirty_ratio=10
vm.dirty_background_ratio=3
vm.swappiness=10
[selinux]
avc_cache_threshold=8192
[net]
type=net
devices_udev_regex=^INTERFACE=(?!ens8f0np0)
channels=combined 4
nf_conntrack_hashsize=131072
[bootloader]
initrd_remove_dir=
initrd_dst_img=
initrd_add_dir=
cmdline_cpu_part=+nohz=on rcu_nocbs=${isolated_cores} tuned.non_isolcpus=${not_isolated_cpumask} systemd.cpu_affinity=${not_isolated_cores_expanded}
cmdline_iommu=
cmdline_isolation=+isolcpus=managed_irq,${isolated_cores}
cmdline_realtime_nohzfull=+nohz_full=${isolated_cores}
cmdline_realtime_nosoftlookup=+nosoftlockup
cmdline_realtime_common=+skew_tick=1 rcutree.kthread_prio=11
cmdline_power_performance=
cmdline_idle_poll=
cmdline_hugepages=+ default_hugepagesz=1G hugepagesz=2M hugepages=0
[rtentsk]