Bug
Resolution: Unresolved
Undefined
None
4.14.z
None
Important
No
CNF Ran Sprint 252, CNF Ran Sprint 253, CNF Ran Sprint 254, CNF RAN Sprint 255, CNF RAN Sprint 256, CNF RAN Sprint 257, CNF RAN Sprint 258, CNF RAN Sprint 259, CNF RAN Sprint 260, CNF RAN Sprint 262
10
False
Description of problem:
On an SNO spoke with the telco DU profile applied, oslat reported a 45 us latency spike during a 1 h run, exceeding the 20 us trace threshold.
Version-Release number of selected component (if applicable):
4.14.20
local-storage-operator.v4.14.0-202403261739
cluster-logging.v5.9.0
packageserver
ptp-operator.v4.14.0-202403222237
sriov-network-operator.v4.14.0-202402270139
sriov-fec.v2.8.0
How reproducible:
always
Steps to Reproduce:
1. Deploy DU node
2. Run OSLAT test pod

[INFO] oslat git hash: ea82509d664d72992068c3a1fc41f9a66e2c3f99
[INFO] oslat image sha: sha256:4b568365d42fd6198aafa6d7ac61a2a6dc842521acb739f05647d5f9b36cca40
[INFO] Pod spec

    apiVersion: v1
    kind: Pod
    metadata:
      name: oslat0
      annotations:
        # Disable CPU balance with CRIO
        irq-load-balancing.crio.io: "disable"
        cpu-load-balancing.crio.io: "disable"
        cpu-quota.crio.io: "disable"
      labels:
        app: oslat
    spec:
      restartPolicy: Never
      runtimeClassName: performance-openshift-node-performance-profile
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - oslat
            topologyKey: "kubernetes.io/hostname"
      containers:
      - args:
        name: container-perf-tools
        image: registry.kni-qe-22.kni.eng.rdu2.dc.redhat.com:5000/ran-test/oslat
        # Force to fetch latest test image
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 16
            memory: 2Gi
          requests:
            cpu: 16
            memory: 2Gi
        env:
        - name: tool
          value: "oslat"
        - name: RUNTIME_SECONDS
          value: 1h
        - name: INITIAL_DELAY_SEC
          value: "30"
        - name: PRIO
          value: "1"
        - name: delay
          value: "60"
        - name: manual
          value: "n"
        - name: TRACE_THRESHOLD
          value: "20"
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /dev/cpu_dma_latency
          name: cstate
      nodeSelector:
        node-role.kubernetes.io/master: ""
      volumes:
      - name: cstate
        hostPath:
          path: /dev/cpu_dma_latency
Actual results:
oslat: Trace threshold (20 us) triggered on cpu 41 with 45 us!
Expected results:
All samples below 20us
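
To triage longer pod logs, a small script along these lines could pull out every threshold hit. This is only a sketch: the regular expression is derived from the single message quoted under "Actual results", and the helper name `find_threshold_hits` is hypothetical, not part of oslat or the test harness.

```python
import re

# Pattern inferred from the one message quoted in this report; other oslat
# versions may word the line differently (assumption).
THRESHOLD_RE = re.compile(
    r"Trace threshold \((\d+) us\) triggered on cpu (\d+) with (\d+) us"
)


def find_threshold_hits(log_text):
    """Return a list of (threshold_us, cpu, latency_us) tuples found in log_text."""
    return [
        tuple(int(group) for group in match.groups())
        for match in THRESHOLD_RE.finditer(log_text)
    ]


sample = "oslat: Trace threshold (20 us) triggered on cpu 41 with 45 us!"
for threshold_us, cpu, latency_us in find_threshold_hits(sample):
    print(f"cpu {cpu}: {latency_us} us exceeds the {threshold_us} us threshold")
```

Feeding it the output of the oslat0 pod would list each offending CPU, which can then be matched against the kernel trace.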
Additional info:
trace file: http://registry.kni-qe-22.kni.eng.rdu2.dc.redhat.com:8080/images/sno.kni-qe-12.lab.eng.rdu2.redhat.com-oslat-kernel-trace.txt
is caused by: RHEL-9148 Interrupt thread not affined after interrupt reaffined (Closed)
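
Since the linked issue concerns interrupt affinity, it may help when reading the trace to check whether the affected CPU (41 here) appears in a given affinity mask. The sketch below decodes the comma-separated hex mask format used by `/proc/irq/<n>/smp_affinity` and by the `Cpus_allowed` field in `/proc/<pid>/status`. The function names and the sample masks are illustrative assumptions, not values taken from the reported node.

```python
def parse_affinity_mask(mask_text):
    """Convert a procfs hex CPU mask such as '00000200,00000000' into a set of CPU ids.

    The kernel prints the mask as comma-separated 32-bit hex words, most
    significant word first, so removing the commas yields one hex number.
    """
    value = int(mask_text.strip().replace(",", ""), 16)
    return {cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1}


def mask_includes_cpu(mask_text, cpu):
    """Return True if the given CPU id is set in the mask."""
    return cpu in parse_affinity_mask(mask_text)


# Illustrative masks only (not captured from the node in this report).
housekeeping_mask = "00000000,00000003"   # CPUs 0 and 1
suspect_mask = "00000200,00000000"        # CPU 41 (bit 9 of the upper word)

print(mask_includes_cpu(housekeeping_mask, 41))
print(mask_includes_cpu(suspect_mask, 41))
```

Comparing an interrupt's `smp_affinity` with the `Cpus_allowed` mask of its `irq/...` kernel thread would show whether the thread followed the interrupt after it was re-affined.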