• Bug
    • Resolution: Unresolved
    • Undefined
    • 4.14.z
    • Telco Performance
    • Quality / Stability / Reliability
    • False
    • Important
    • No
    • 2025/03/03: Fix is now available in RHEL 9.2 (RHEL-68397) and should be in OCP 4.14.45 and later z-streams. Ready to be retested.
      2024/11/18: Fix is now available in RHEL 9 (RHEL-9148); backports will be required to RHEL 9.4 (OCP 4.16) and RHEL 9.2 (OCP 4.14).
      2024/11/01: No change, still waiting.
      2024/09/24: Still waiting for the fix to be verified (RHEL-9148); needs a backport to 9.2.0.z.
    • CNF RAN Sprint 277, CNF RAN Sprint 278, CNF RAN Sprint 279, CNF RAN Sprint 280, CNF RAN Sprint 281, CNF RAN Sprint 282, CNF RAN Sprint 283, CNF RAN Sprint 284
    • 8

      Description of problem:

          On an SNO spoke cluster with the telco DU profile applied, oslat reported a 45 us latency spike during a 1 h run, exceeding the 20 us trace threshold.
      

      Version-Release number of selected component (if applicable):

          4.14.20
      
      local-storage-operator.v4.14.0-202403261739 
      cluster-logging.v5.9.0                      
      packageserver                               
      ptp-operator.v4.14.0-202403222237           
      sriov-network-operator.v4.14.0-202402270139 
      sriov-fec.v2.8.0                            
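
The status notes say the fix should be in OCP 4.14.45 and later z-streams, while this report was filed against 4.14.20. Before retesting, it may help to confirm that the cluster version is new enough. The following is a minimal sketch, not an official tool: the constant `FIRST_FIXED_4_14` and both function names are hypothetical, and the threshold simply restates the expectation from the status notes, which still needs to be verified on a real cluster.

```python
import re

# Assumption: first 4.14 z-stream expected to carry the fix, per the status notes.
FIRST_FIXED_4_14 = (4, 14, 45)


def parse_version(text):
    """Parse an 'X.Y.Z' string into a tuple of ints; reject other shapes."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unexpected version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_expected_fix(version_text):
    """Return True if a 4.14.z version is at or past the expected fixed z-stream."""
    version = parse_version(version_text)
    if version[:2] != FIRST_FIXED_4_14[:2]:
        raise ValueError("this sketch only covers the 4.14 z-stream")
    return version >= FIRST_FIXED_4_14


print(has_expected_fix("4.14.20"))  # False: the version this bug was reported on
print(has_expected_fix("4.14.45"))  # True
```

Versions are compared as integer tuples rather than strings so that, for example, 4.14.9 sorts before 4.14.45.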
      

      How reproducible:

          always

      Steps to Reproduce:

          1. Deploy the DU node.
          2. Run the oslat test pod using the pod spec below.
      
      [INFO] oslat git hash: ea82509d664d72992068c3a1fc41f9a66e2c3f99
      [INFO] oslat image sha: sha256:4b568365d42fd6198aafa6d7ac61a2a6dc842521acb739f05647d5f9b36cca40
      [INFO] Pod spec
      apiVersion: v1
      kind: Pod
      metadata:
        name: oslat0
        annotations:
          # Disable CPU balance with CRIO
          irq-load-balancing.crio.io: "disable"
          cpu-load-balancing.crio.io: "disable"
          cpu-quota.crio.io: "disable"
        labels:
          app: oslat
      spec:
        restartPolicy: Never
        runtimeClassName: performance-openshift-node-performance-profile
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                   - oslat
              topologyKey: "kubernetes.io/hostname"
        containers:
        - args:
          name: container-perf-tools
          image: registry.kni-qe-22.kni.eng.rdu2.dc.redhat.com:5000/ran-test/oslat
          # Force a fetch of the latest test image
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 16
              memory: 2Gi
            requests:
              cpu: 16
              memory: 2Gi
          env:
          - name: tool
            value: "oslat"
          - name: RUNTIME_SECONDS
            value: 1h
          - name: INITIAL_DELAY_SEC
            value: "30"
          - name: PRIO
            value: "1"
          - name: delay
            value: "60"
          - name: manual
            value: "n"
          - name: TRACE_THRESHOLD
            value: "20"
          securityContext:
            privileged: true
          volumeMounts:
          - mountPath: /dev/cpu_dma_latency
            name: cstate
        nodeSelector:
          node-role.kubernetes.io/master: ""
        volumes:
        - name: cstate
          hostPath:
            path: /dev/cpu_dma_latency
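
The pod spec above sets CPU and memory requests equal to limits with a whole-number CPU count, which is what places a pod in the Guaranteed QoS class and makes it eligible for exclusive CPUs under the static CPU manager policy. The sketch below checks that property on a dictionary mirroring the `resources` stanza. It is a simplification: the function name is hypothetical, quantities are compared as plain strings (so `"1000m"` and `"1"` would not be treated as equal), and the case where only limits are set is treated as a failure even though Kubernetes would default requests to limits.

```python
def is_guaranteed_with_whole_cpus(resources):
    """Simplified check: requests == limits for cpu and memory, and cpu is an integer."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    for name in ("cpu", "memory"):
        if name not in requests or name not in limits:
            return False
        # String comparison only; real Kubernetes quantities need canonicalization.
        if str(requests[name]) != str(limits[name]):
            return False
    return str(limits["cpu"]).isdigit()


# Values copied from the pod spec in this report.
oslat_resources = {
    "limits": {"cpu": 16, "memory": "2Gi"},
    "requests": {"cpu": 16, "memory": "2Gi"},
}

print(is_guaranteed_with_whole_cpus(oslat_resources))  # True
```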

      Actual results:

          oslat: Trace threshold (20 us) triggered on cpu 41 with 45 us!
      

      Expected results:

          All samples below 20 us
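
To compare actual and expected results automatically when retesting, the threshold message can be parsed from the pod log. This is a minimal sketch: the regular expression is derived only from the single message quoted under "Actual results", so other oslat versions may word it differently, and the function and field names are hypothetical.

```python
import re

# Pattern modeled on the one log line quoted in this report (an assumption
# about the general format, not taken from the oslat source).
TRIGGER_RE = re.compile(
    r"oslat: Trace threshold \((\d+) us\) triggered on cpu (\d+) with (\d+) us!"
)


def parse_trigger(line):
    """Return threshold, CPU and measured latency from a trigger line, or None."""
    match = TRIGGER_RE.search(line)
    if match is None:
        return None
    threshold_us, cpu, latency_us = (int(group) for group in match.groups())
    return {
        "threshold_us": threshold_us,
        "cpu": cpu,
        "latency_us": latency_us,
        "excess_us": latency_us - threshold_us,
    }


line = "oslat: Trace threshold (20 us) triggered on cpu 41 with 45 us!"
print(parse_trigger(line))
```

A run meets the expected result when no line in the log matches, that is, when `parse_trigger` returns `None` for every line.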
      

      Additional info:

          trace file: http://registry.kni-qe-22.kni.eng.rdu2.dc.redhat.com:8080/images/sno.kni-qe-12.lab.eng.rdu2.redhat.com-oslat-kernel-trace.txt

              rh-ee-cshulyup Costa Shulyupin
              mcornea@redhat.com Marius Cornea
              Marius Cornea