Bug
Resolution: Unresolved
4.14.z
Quality / Stability / Reliability
Important
CNF RAN Sprint 277, CNF RAN Sprint 278, CNF RAN Sprint 279, CNF RAN Sprint 280
Description of problem:
On an SNO spoke cluster with the telco DU profile applied, oslat reported a 45 us latency spike during a 1h run, exceeding the 20 us trace threshold.
Version-Release number of selected component (if applicable):
4.14.20
local-storage-operator.v4.14.0-202403261739
cluster-logging.v5.9.0
packageserver
ptp-operator.v4.14.0-202403222237
sriov-network-operator.v4.14.0-202402270139
sriov-fec.v2.8.0
How reproducible:
always
Steps to Reproduce:
1. Deploy DU node
2. Run OSLAT test pod
[INFO] oslat git hash: ea82509d664d72992068c3a1fc41f9a66e2c3f99
[INFO] oslat image sha: sha256:4b568365d42fd6198aafa6d7ac61a2a6dc842521acb739f05647d5f9b36cca40
[INFO] Pod spec
apiVersion: v1
kind: Pod
metadata:
  name: oslat0
  annotations:
    # Disable CPU balance with CRIO
    irq-load-balancing.crio.io: "disable"
    cpu-load-balancing.crio.io: "disable"
    cpu-quota.crio.io: "disable"
  labels:
    app: oslat
spec:
  restartPolicy: Never
  runtimeClassName: performance-openshift-node-performance-profile
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - oslat
        topologyKey: "kubernetes.io/hostname"
  containers:
  - args:
    name: container-perf-tools
    image: registry.kni-qe-22.kni.eng.rdu2.dc.redhat.com:5000/ran-test/oslat
    # Force to fetch latest test image
    imagePullPolicy: Always
    resources:
      limits:
        cpu: 16
        memory: 2Gi
      requests:
        cpu: 16
        memory: 2Gi
    env:
    - name: tool
      value: "oslat"
    - name: RUNTIME_SECONDS
      value: 1h
    - name: INITIAL_DELAY_SEC
      value: "30"
    - name: PRIO
      value: "1"
    - name: delay
      value: "60"
    - name: manual
      value: "n"
    - name: TRACE_THRESHOLD
      value: "20"
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /dev/cpu_dma_latency
      name: cstate
  nodeSelector:
    node-role.kubernetes.io/master: ""
  volumes:
  - name: cstate
    hostPath:
      path: /dev/cpu_dma_latency
Actual results:
oslat: Trace threshold (20 us) triggered on cpu 41 with 45 us!
Expected results:
All samples below 20us
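For anyone automating the pass/fail check, here is a minimal sketch of how the pod log could be scanned for threshold violations. It assumes the log contains the trigger message in exactly the format quoted under "Actual results". The function name find_threshold_violations is hypothetical and is not part of oslat or the test image.

```python
import re

# Assumption: trigger lines look exactly like the message quoted in this report.
TRIGGER_RE = re.compile(
    r"oslat: Trace threshold \((\d+) us\) triggered on cpu (\d+) with (\d+) us!"
)


def find_threshold_violations(log_text):
    """Return a (cpu, latency_us, threshold_us) tuple for each trigger line."""
    violations = []
    for match in TRIGGER_RE.finditer(log_text):
        threshold_us, cpu, latency_us = (int(group) for group in match.groups())
        violations.append((cpu, latency_us, threshold_us))
    return violations


# The line reported in this bug, used as sample input.
sample = "oslat: Trace threshold (20 us) triggered on cpu 41 with 45 us!"
print(find_threshold_violations(sample))
```

An empty list would correspond to the expected result (no sample at or above the trace threshold). A non-empty list identifies the CPU and the measured latency to look up in the kernel trace.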
Additional info:
trace file: http://registry.kni-qe-22.kni.eng.rdu2.dc.redhat.com:8080/images/sno.kni-qe-12.lab.eng.rdu2.redhat.com-oslat-kernel-trace.txt
Is caused by: RHEL-9148 Interrupt thread not affined after interrupt reaffined (Closed)