-
Bug
-
Resolution: Unresolved
4.12
Quality / Stability / Reliability
Description of problem:
Seeing high cyclictest latencies on a single-node OpenShift (SNO) 4.12 installation with the real-time kernel.
Version-Release number of selected component (if applicable):
How reproducible:
Reproducible
Steps to Reproduce:
1. Apply the Tuned profile and the real-time kernel MachineConfig:
oc apply -f - << EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-node-custom
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Custom OpenShift node profile with additional kernel parameters
      [bootloader]
      cmdline_openshift_node_custom=+intel_iommu=on iommu=pt noefi vfio_pci.enable_sriov=1 vfio_pci.disable_idle_d3=1 usbcore.autosuspend=-1 enforcing=0 nmi_watchdog=0 crashkernel=auto softlockup_panic=0 audit=0 mce=off hugepagesz=1G hugepages=32 hugepagesz=2M hugepages=0 default_hugepagesz=1G kthread_cpus=0,32 irqaffinity=0,32 nohz=on rcu_nocb_poll skew_tick=1 isolcpus=managed_irq,domain,1-31,33-63 nohz_full=1-31,33-63 rcu_nocbs=1-31,33-63 nosoftlockup
      [irqbalance]
      banned_cpus=1-31,33-63
      [scheduler]
      isolated_cores=1-31,33-63
      [rtentsk]
      [cpu]
      force_latency=-1
    name: openshift-node-custom
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: master
    priority: 0
    profile: openshift-node-custom
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "master"
  name: 99-master-realtime
spec:
  kernelType: realtime
EOF
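The profile above splits the CPUs into an isolated set (1-31,33-63) and a housekeeping set (0,32). As a quick sanity check of that partitioning, the following is a minimal sketch that expands kernel-style CPU lists and reports any overlap. The helper names expand_cpulist and cpulist_overlap are hypothetical and not part of any OpenShift tooling.

```shell
# Expand a kernel-style CPU list such as "1-31,33-63" into one CPU id per line.
# (Hypothetical helper, for illustration only.)
expand_cpulist() {
  # Split on commas, then expand each "a-b" range; a single id "a" becomes "a-a".
  echo "$1" | tr ',' '\n' | while IFS='-' read -r start end; do
    seq "$start" "${end:-$start}"
  done
}

# Print the CPUs that appear in both lists (no output means no overlap).
cpulist_overlap() {
  { expand_cpulist "$1" | sort -un; expand_cpulist "$2" | sort -un; } \
    | sort -n | uniq -d
}

isolated="1-31,33-63"   # isolcpus / nohz_full / rcu_nocbs in the profile
housekeeping="0,32"     # kthread_cpus / irqaffinity in the profile

echo "isolated CPUs:     $(expand_cpulist "$isolated" | wc -l)"
echo "housekeeping CPUs: $(expand_cpulist "$housekeeping" | wc -l)"
echo "overlap:           $(cpulist_overlap "$isolated" "$housekeeping" | tr '\n' ' ')"
```

The same helper can confirm that the measurement CPUs used in step 5 (-a 1-16) all fall inside the isolated set, while the main thread (--mainaffinity=0) stays on a housekeeping CPU.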
2. Patch image registry operator
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState":"Managed"}}'
3. Create container image to run cyclictest
oc apply -f - << EOF
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: cyclictest-test
  namespace: default
spec:
  tags:
  - name: latest
---
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: cyclictest-test-build-config
  namespace: default
spec:
  output:
    to:
      kind: ImageStreamTag
      name: cyclictest-test:latest
  source:
    dockerfile: |
      FROM quay.io/centos/centos:stream8
      RUN dnf upgrade -y --refresh && dnf install -y rt-tests python3 kernel-tools
  strategy:
    type: Docker
EOF
oc start-build cyclictest-test-build-config
4. Create cyclictest pod
oc apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: cyclictest-test
  namespace: default
spec:
  nodeName: node0
  containers:
  - name: cyclictest-test
    image: $(oc get images | grep cyclictest-test | awk '{ print $2 }')
    command: ["/bin/bash"]
    tty: true
    securityContext:
      privileged: true
      capabilities:
        add:
        - IPC_LOCK
        - SYS_NICE
        - SYS_ADMIN
EOF
5. Run cyclictest
oc rsh pods/cyclictest-test taskset -c 0-16 cyclictest -m -p95 -h 15 -a 1-16 -t 16 --mainaffinity=0
Actual results:
All threads show max latency >= 10us (most are around 10-12us, but some are as high as 35us)
Expected results:
All threads should have max latency < 10us after running cyclictest for an extended period of time (6-12hrs)
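To check the < 10us expectation across a long run without reading the output by eye, the following is a minimal sketch that flags threads at or above a threshold. It assumes the cyclictest output was saved to a file and that the histogram summary printed with -h contains a line of the form "# Max Latencies: <t0> <t1> ..."; the function name check_max_latencies and the file name cyclictest.log are hypothetical.

```shell
# Report threads whose max latency (in us) is at or above a threshold.
# Reads saved cyclictest output on stdin; $1 is the threshold (default 10).
# Exit status: 0 = all threads below the threshold, 1 = at least one thread
# at or above it, 2 = no "# Max Latencies:" line found (assumed summary format).
check_max_latencies() {
  awk -v limit="${1:-10}" '
    /^# Max Latencies:/ {
      found = 1
      # Fields 1-3 are "#", "Max", "Latencies:"; per-thread values start at 4.
      for (i = 4; i <= NF; i++) {
        lat = $i + 0
        if (lat >= limit) {
          printf "thread %d: max %d us\n", i - 4, lat
          bad++
        }
      }
    }
    END {
      if (!found) exit 2
      exit (bad > 0)
    }
  '
}

# Example usage (cyclictest.log is a hypothetical capture of the step 5 output):
#   check_max_latencies 10 < cyclictest.log
```

Note that with -h 15 any sample above 15us lands in the histogram overflow count, so the max-latency summary line is the more useful figure for spotting outliers such as the 35us case.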
Additional info:
This was tested on the same platform with Ubuntu RT, and the max latencies of all threads were < 10us.
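When comparing against the Ubuntu RT run, it may help to confirm that the kernel arguments from step 1 actually took effect on the node. The following is a minimal sketch under that assumption; missing_kernel_args is a hypothetical helper that prints every expected argument absent from a given kernel command line.

```shell
# Print each expected kernel argument that is absent from a command line.
# $1: the kernel command line; remaining arguments: expected parameters.
# (Hypothetical helper, for illustration only.)
missing_kernel_args() {
  cmdline=" $1 "   # pad with spaces so every argument is space-delimited
  shift
  for arg in "$@"; do
    case "$cmdline" in
      *" $arg "*) ;;          # argument present, nothing to report
      *) echo "$arg" ;;       # argument missing
    esac
  done
}

# Example usage against the node from step 4 (requires cluster access):
#   cmdline=$(oc debug node/node0 -- chroot /host cat /proc/cmdline)
#   missing_kernel_args "$cmdline" \
#     nohz_full=1-31,33-63 rcu_nocbs=1-31,33-63 \
#     isolcpus=managed_irq,domain,1-31,33-63 rcu_nocb_poll skew_tick=1
```

An empty result means every listed argument is present on the running kernel command line; it says nothing about whether the real-time kernel itself is booted, which can be checked separately with uname -r on the node.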