OCPBUGS-25305

Tuned Profiles going degraded due to the extra net.core.rps_default_mask configuration in openshift-node-performance-xxx-profile


Details

      * Previously, the Tuned profile reported a `Degraded` condition after applying a PerformanceProfile because the generated profile tried to set a sysctl value a second time. With this release, the sysctl value is no longer set by tuned; it is set only by the `sysctl.d` file. (link:https://issues.redhat.com/browse/OCPBUGS-25305[*OCPBUGS-25305*])

      Issue:

      The Tuned profile reports a Degraded condition after applying a PerformanceProfile.

      Cause:

      The error reporting has more issues than this bug alone, but they are being addressed one at a time. The generated Tuned profile tries to set the sysctl value for the default RPS mask even though the same value has already been configured through an /etc/sysctl.d file. Tuned warns about the duplicate, and NTO treats that warning as a degradation:

      TuneD daemon issued one or more error message(s) during profile application. TuneD stderr: net.core.rps_default_mask'

      Fix:

      The sysctl needs to be set early, so the duplication was solved by not setting the default RPS mask via tuned. The sysctl.d file was left in place as it applies early during boot.

      Result:

      Other sysctls might still cause the Degraded condition to appear, but there should no longer be any trace of a failure to apply this specific sysctl.
    • Bug Fix
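The duplication described under Cause can be illustrated with a minimal Python sketch that compares the sysctls set by a TuneD profile with those set by a sysctl.d file. This is not how NTO or TuneD detect the conflict; it only shows the overlap. The function names, the `[sysctl]` section placement, the `kernel.sched_rt_runtime_us` entry, and the mask value `3` are assumptions made for illustration.

```python
def parse_tuned_sysctls(profile_text):
    """Collect key -> value pairs from the [sysctl] section of a TuneD profile."""
    section = None
    sysctls = {}
    for raw in profile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "sysctl" and "=" in line:
            key, value = line.split("=", 1)
            sysctls[key.strip()] = value.strip()
    return sysctls


def parse_sysctl_d(conf_text):
    """Collect key -> value pairs from sysctl.d-style content."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            # A leading "-" marks a setting whose failure is ignored; drop it.
            settings[key.strip().lstrip("-")] = value.strip()
    return settings


def duplicated_sysctls(profile_text, conf_text):
    """Return the sorted sysctl names that both sources try to set."""
    in_profile = set(parse_tuned_sysctls(profile_text))
    in_sysctl_d = set(parse_sysctl_d(conf_text))
    return sorted(in_profile & in_sysctl_d)


# Hypothetical inputs: only net.core.rps_default_mask comes from the report.
PROFILE_TEXT = """\
[main]
summary=Illustrative profile
[sysctl]
kernel.sched_rt_runtime_us=-1
net.core.rps_default_mask=${not_isolated_cpumask}
"""

SYSCTL_D_TEXT = """\
# hypothetical /etc/sysctl.d content
net.core.rps_default_mask = 3
"""

duplicates = duplicated_sysctls(PROFILE_TEXT, SYSCTL_D_TEXT)
print(duplicates)
```

With the fix in place, the profile side would no longer contain `net.core.rps_default_mask`, so the overlap for that key would be empty.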

    Description

      This is a clone of issue OCPBUGS-25092. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-24638. The following is the description of the original issue:

      Description of problem:
      Issue: Profiles remain degraded [1] even after being applied, due to the error below [2]:

      [1]

      $ oc get profile -A
      NAMESPACE                                NAME       TUNED                APPLIED   DEGRADED   AGE
      openshift-cluster-node-tuning-operator   master0    rdpmc-patch-master   True      True       5d
      openshift-cluster-node-tuning-operator   master1    rdpmc-patch-master   True      True       5d
      openshift-cluster-node-tuning-operator   master2    rdpmc-patch-master   True      True       5d
      openshift-cluster-node-tuning-operator   worker0    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker1    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker10   rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker11   rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker12   rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker13   rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker14   rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker15   rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker2    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker3    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker4    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker5    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker6    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker7    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker8    rdpmc-patch-worker   True      True       5d
      openshift-cluster-node-tuning-operator   worker9    rdpmc-patch-worker   True      True       5d
      

      [2]

        lastTransitionTime: "2023-12-05T22:43:12Z"
        message: TuneD daemon issued one or more sysctl override message(s) during profile
          application. Use reapply_sysctl=true or remove conflicting sysctl net.core.rps_default_mask
        reason: TunedSysctlOverride
        status: "True"
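To list every affected node together with the reason, the JSON form of the profile list can be filtered for a true Degraded condition. The following is a minimal Python sketch, assuming the output of `oc get profile -A -o json` is a list object whose items carry `status.conditions` entries with `type`, `status`, `reason`, and `message` fields, as the excerpt above suggests. The function name and the sample data, including the Applied condition and its `AsExpected` reason, are hypothetical.

```python
def degraded_profiles(profile_list):
    """Return (node, reason, message) for each Profile whose Degraded condition is True."""
    found = []
    for item in profile_list.get("items", []):
        name = item.get("metadata", {}).get("name", "")
        for cond in item.get("status", {}).get("conditions", []):
            if cond.get("type") == "Degraded" and cond.get("status") == "True":
                found.append((name, cond.get("reason", ""), cond.get("message", "")))
    return found


# Hypothetical sample shaped like the assumed JSON output.
OVERRIDE_MESSAGE = (
    "TuneD daemon issued one or more sysctl override message(s) during profile "
    "application. Use reapply_sysctl=true or remove conflicting sysctl "
    "net.core.rps_default_mask"
)

sample = {
    "items": [
        {
            "metadata": {"name": "master0"},
            "status": {
                "conditions": [
                    {"type": "Applied", "status": "True", "reason": "AsExpected", "message": ""},
                    {
                        "type": "Degraded",
                        "status": "True",
                        "reason": "TunedSysctlOverride",
                        "message": OVERRIDE_MESSAGE,
                    },
                ]
            },
        },
        {
            "metadata": {"name": "healthy-node"},
            "status": {"conditions": [{"type": "Degraded", "status": "False"}]},
        },
    ]
}

for node, reason, message in degraded_profiles(sample):
    print(f"{node}: {reason}: {message}")
```

In practice the dictionary would come from `json.loads()` applied to the command output rather than from an inline sample.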
      

      The master nodes all use the rdpmc-patch-master Tuned profile:

      NAMESPACE                                NAME       TUNED                APPLIED   DEGRADED   AGE
      openshift-cluster-node-tuning-operator   master0    rdpmc-patch-master   True      True       5d
      openshift-cluster-node-tuning-operator   master1    rdpmc-patch-master   True      True       5d
      openshift-cluster-node-tuning-operator   master2    rdpmc-patch-master   True      True       5d
      

      The rdpmc-patch-master Tuned object is configured as follows:

      $ oc get tuned rdpmc-patch-master -n openshift-cluster-node-tuning-operator -oyaml |less
      spec:
        profile:
        - data: |
            [main]
            include=performance-patch-master
            [sysfs]
            /sys/devices/cpu/rdpmc = 2
          name: rdpmc-patch-master
        recommend:
      

      The performance-patch-master profile, which the Tuned object above includes, contains the following:

      spec:
        profile:
        - data: |
            [main]
            summary=Custom tuned profile to adjust performance
            include=openshift-node-performance-master-profile
            [bootloader]
            cmdline_removeKernelArgs=-nohz_full=${isolated_cores}
      

      The setting named in the error comes from openshift-node-performance-master-profile, which the profile above includes:

      net.core.rps_default_mask=${not_isolated_cpumask}
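The include chain walked through above (rdpmc-patch-master, then performance-patch-master, then openshift-node-performance-master-profile) can be followed mechanically. The sketch below is a minimal Python illustration, not TuneD's own resolution logic. It assumes `include=` appears in `[main]` and may hold a comma-separated list, and it places the offending line in a `[sysctl]` section, which the report does not show. The function names are hypothetical.

```python
def includes_of(text):
    """Return the profile names listed on the include= line of [main]."""
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif section == "main" and line.startswith("include="):
            names = line.split("=", 1)[1].split(",")
            return [n.strip() for n in names if n.strip()]
    return []


def defines(text, key):
    """True if any 'key=value' line in the profile sets the given key."""
    for raw in text.splitlines():
        if "=" in raw and raw.split("=", 1)[0].strip() == key:
            return True
    return False


def trace_setting(profiles, start, key, seen=None):
    """Depth-first walk of include= links; returns the chain ending at the definer."""
    seen = set() if seen is None else seen
    if start in seen or start not in profiles:
        return None
    seen.add(start)
    text = profiles[start]
    if defines(text, key):
        return [start]
    for parent in includes_of(text):
        chain = trace_setting(profiles, parent, key, seen)
        if chain:
            return [start] + chain
    return None


# Profile bodies abridged from the report; the [sysctl] header is an assumption.
profiles = {
    "rdpmc-patch-master": (
        "[main]\ninclude=performance-patch-master\n"
        "[sysfs]\n/sys/devices/cpu/rdpmc = 2\n"
    ),
    "performance-patch-master": (
        "[main]\nsummary=Custom tuned profile to adjust performance\n"
        "include=openshift-node-performance-master-profile\n"
        "[bootloader]\ncmdline_removeKernelArgs=-nohz_full=${isolated_cores}\n"
    ),
    "openshift-node-performance-master-profile": (
        "[sysctl]\nnet.core.rps_default_mask=${not_isolated_cpumask}\n"
    ),
}

chain = trace_setting(profiles, "rdpmc-patch-master", "net.core.rps_default_mask")
print(" -> ".join(chain))
```

The returned chain names each profile on the path, which makes it clear that the conflicting sysctl originates in the generated performance profile rather than in either custom patch profile.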
      

      A RHEL bug has been raised for the same issue: https://issues.redhat.com/browse/RHEL-18972

      Version-Release number of selected component (if applicable):
      4.14
          


            People

              msivak@redhat.com Martin Sivak
              openshift-crt-jira-prow OpenShift Prow Bot
              Mallapadi Niranjan Mallapadi Niranjan