  1. OpenShift Node
  2. OCPNODE-3806

Node Team to reconsider enabling PSI Metrics to help CNV Descheduler


    • Story
    • Resolution: Done
    • Critical
    • openshift-4.21
    • Quality / Stability / Reliability
    • OCP Node Sprint 278

      1. Summary

      The CNV Descheduler is currently failing because it depends on PSI metrics. These metrics were disabled on all OpenShift node types so that cyclictest passes on Real-Time (RT) kernels, which created a conflict between Telco environments and CNV descheduler functionality. This regression has triggered an investigation into conditionally re-enabling PSI metrics (probably based on the node type).

       

      Update: After further discussion with bwensley@redhat.com from Telco, low-latency applications also run on non-RT kernels, so the fix should not be based on kernelType.

      2. History and Context

      PSI metrics were deliberately disabled in the cluster to address cyclictest issues on RT kernels. This fix appears to have introduced a regression in the CNV descheduler which relies on this data.

      • Original Reason for Disablement (OCPBUGS-37271): PSI metrics caused unacceptable latency overhead for latency-sensitive applications (Telco RAN DU deployments) running on RT kernels (5.14.0-427.22.1.el9_4.x86_64+rt kernel).
      • CNV Descheduler Dependency: The CNV descheduler was developed on the assumption that PSI metrics would be available, and it is currently having issues without them.
      • Timeline: The decision and implementation occurred in July 2024. The fix (psi=0) was verified and included in OCP 4.17.0-0.nightly-2024-07-25-212849.
      • Current Discussion: The CNV team is trying to use MCO's MachineConfig priority ordering to manually override the psi=0 setting carried by the 97-worker-generated-kubelet MC, but this appears to be difficult.
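
The psi=0 setting discussed above is a kernel command-line argument, so whether a given node ended up with PSI disabled can be checked from its /proc/cmdline. The following is a minimal sketch, not part of the ticket: the function names are hypothetical, and it assumes that when psi= appears more than once the last occurrence is the one the kernel honors.

```python
from typing import Optional


def psi_setting(cmdline: str) -> Optional[str]:
    """Return the value of the psi= kernel argument, or None if it is absent.

    Assumption: if psi= is repeated, the last occurrence takes effect.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("psi="):
            value = token[len("psi="):]
    return value


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the booted kernel command line (Linux only)."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # Prints "0" on a node carrying the psi=0 fix, "None" if no psi= argument is set.
    print(psi_setting(read_cmdline()))
```

Running this on a worker node (for example from a debug shell) shows which value actually reached the kernel after all MachineConfigs were merged.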

      3. Steps to Reproduce

      1. Ensure the environment has the configuration fix from MCO PR #4470 applied (i.e., PSI metrics are disabled/removed from monitoring/collection via the kernel command line, resulting in no /proc/pressure/cpu output or equivalent).
      2. TODO: Get the steps from the CNV team about the descheduler dependency
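
Step 1 can be checked with a short script. This is a sketch under assumptions: the helper names are hypothetical, and it treats both a missing /proc/pressure/cpu file and an "operation not supported" error on reading it as "PSI disabled", since the exact behavior depends on the kernel version. It says nothing about how the descheduler itself consumes the data.

```python
import errno
from typing import Dict, Optional


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI text such as 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind = fields[0]  # "some" or "full"
        values: Dict[str, float] = {}
        for field in fields[1:]:
            key, _, raw = field.partition("=")
            # total is a cumulative microsecond counter; averages are percentages.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def read_psi(path: str = "/proc/pressure/cpu") -> Optional[Dict[str, Dict[str, float]]]:
    """Return parsed PSI data, or None when PSI appears to be disabled."""
    try:
        with open(path) as f:
            return parse_psi(f.read())
    except FileNotFoundError:
        return None
    except OSError as exc:
        if exc.errno == errno.EOPNOTSUPP:
            return None
        raise


if __name__ == "__main__":
    data = read_psi()
    print("PSI disabled" if data is None else data)
```

On a node with the psi=0 fix applied, the script is expected to report that PSI is disabled; on a node with PSI enabled it prints the parsed pressure values.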

      4. Expected Behavior

      1. The CNV descheduler should have access to PSI metrics.
      2. The Telco team's latency-sensitive deployments should still have PSI metrics disabled.

      5. Next Steps

      6. Dependent teams

      1. Performance team: Work with the Telco team to re-run cyclictest after the k8s 1.34 merge, on both RT and non-RT kernels. Documentation is already available here: https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/scalability_and_performance/cnf-performing-platform-verification-latency-tests#cnf-performing-end-to-end-tests-running-the-tests_cnf-latency-tests
      2. CNV Team: After PSI metrics are enabled, remove any patches that were added to re-enable them.

      Questions:

      1. Get clarity on the failure seen in the descheduler. Can the logic in the descheduler be independently tested by the Node Team?

              rh-ee-ngopalak Neeraj Krishna Gopalakrishna
              Peter Hunt