OCPBUGS-52484

[4.18] Tuned profile degraded in ARM on Vendor Id not matching Ampere (APM)

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • 4.18.0, 4.19
    • Node Tuning Operator
    • Cause: Ampere ARM-based CPUs report a different CPU Vendor ID (APM) than other ARM CPUs.

      Issue: The per-platform tuning matched on Vendor ID, so it did not identify machines with such CPUs as ARM-based.

      Fix: For now, ARM detection uses the Architecture field (aarch64) instead, because the same tuning is used for all vendors.

      Effect: Machines with Ampere CPUs should now be tuned properly.
    • Bug Fix
    • In Progress

      Description of problem:

          Applying a performance profile on an ARM cluster causes the TuneD profile to become degraded.

      Version-Release number of selected component (if applicable):

          

      How reproducible:

      Always

      Steps to Reproduce:

      1. Label a worker node with the worker-cnf label
      2. Create a MachineConfigPool (MCP) that refers to that label
      3. Apply the performance profile below
      
      apiVersion: performance.openshift.io/v2
      kind: PerformanceProfile
      metadata:
        name: performance
      spec:
        cpu:
          isolated: "1-3,4-6"
          reserved: "0,7"
        hugepages:
          defaultHugepagesSize: 512M
          pages:
          - count: 1
            node: 0
            size: 512M
          - count: 128
            node: 1
            size: 2M
        machineConfigPoolSelector:
          machineconfiguration.openshift.io/role: worker-cnf
        net:
          userLevelNetworking: true
        nodeSelector:
          node-role.kubernetes.io/worker-cnf: ''
        kernelPageSize: 64k
        numa:
          topologyPolicy: single-numa-node
        realTimeKernel:
          enabled: false
        workloadHints:
          highPowerConsumption: true
          perPodPowerManagement: false
          realTime: true
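
The profile above writes the isolated set as two adjacent ranges ("1-3,4-6"), while the generated tuned.conf later in this report shows isolated_cores=1-6. The sketch below shows how such a CPU list can be expanded, checked for overlap with the reserved set, and collapsed back into ranges. The helper names parse_cpu_list and format_cpu_list are hypothetical and are not taken from the Node Tuning Operator code.

```python
def parse_cpu_list(spec: str) -> set[int]:
    """Expand a CPU list such as "1-3,4-6" into a set of CPU ids."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpu_list(cpus: set[int]) -> str:
    """Collapse a set of CPU ids into the shortest range notation."""
    ranges: list[tuple[int, int]] = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            # Extend the current range by one CPU.
            ranges[-1] = (ranges[-1][0], cpu)
        else:
            # Start a new range.
            ranges.append((cpu, cpu))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


# Values copied from the PerformanceProfile in this report.
isolated = parse_cpu_list("1-3,4-6")
reserved = parse_cpu_list("0,7")

if __name__ == "__main__":
    print("isolated:", format_cpu_list(isolated))
    print("reserved:", format_cpu_list(reserved))
    print("overlap: ", sorted(isolated & reserved))
```

With the values from this report the two sets do not overlap, so the CPU partitioning itself does not look like the source of the degraded state.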

      Actual results:

          

      Expected results:

          

      Additional info:

      [root@ampere-one-x-04 ~]# oc get profiles -A
      NAMESPACE                                NAME                                             TUNED                                    APPLIED   DEGRADED   MESSAGE                                                            AGE
      openshift-cluster-node-tuning-operator   ocp-ctlplane-0.libvirt.lab.eng.tlv2.redhat.com   openshift-control-plane                  True      False      TuneD profile applied.                                             22h
      openshift-cluster-node-tuning-operator   ocp-ctlplane-1.libvirt.lab.eng.tlv2.redhat.com   openshift-control-plane                  True      False      TuneD profile applied.                                             22h
      openshift-cluster-node-tuning-operator   ocp-ctlplane-2.libvirt.lab.eng.tlv2.redhat.com   openshift-control-plane                  True      False      TuneD profile applied.                                             22h
      openshift-cluster-node-tuning-operator   ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com     openshift-node-performance-performance   False     True       The TuneD daemon profile not yet applied, or application failed.   22h
      openshift-cluster-node-tuning-operator   ocp-worker-1.libvirt.lab.eng.tlv2.redhat.com     openshift-node                           True      False      TuneD profile applied.                                             22h
      openshift-cluster-node-tuning-operator   ocp-worker-2.libvirt.lab.eng.tlv2.redhat.com     openshift-node                           True      False      TuneD profile applied.                                             22h
      
      [root@ampere-one-x-04 ~]# oc describe performanceprofile
      Name:         performance
      Namespace:
      Labels:       <none>
      Annotations:  <none>
      API Version:  performance.openshift.io/v2
      Kind:         PerformanceProfile
      Metadata:
        Creation Timestamp:  2025-03-04T15:28:44Z
        Finalizers:
          foreground-deletion
        Generation:        1
        Resource Version:  74234
        UID:               0d9c1817-c12f-4ea8-9c4b-b37badc232e9
      Spec:
        Cpu:
          Isolated:  1-3,4-6
          Reserved:  0,7
        Hugepages:
          Default Hugepages Size:  512M
          Pages:
            Count:         1
            Node:          0
            Size:          512M
            Count:         128
            Node:          1
            Size:          2M
        Kernel Page Size:  64k
        Machine Config Pool Selector:
          machineconfiguration.openshift.io/role:  worker-cnf
        Net:
          User Level Networking:  true
        Node Selector:
          node-role.kubernetes.io/worker-cnf:
        Numa:
          Topology Policy:  single-numa-node
        Real Time Kernel:
          Enabled:  false
        Workload Hints:
          High Power Consumption:    true
          Per Pod Power Management:  false
          Real Time:                 true
      Status:
        Conditions:
          Last Heartbeat Time:   2025-03-04T15:28:45Z
          Last Transition Time:  2025-03-04T15:28:45Z
          Status:                False
          Type:                  Available
          Last Heartbeat Time:   2025-03-04T15:28:45Z
          Last Transition Time:  2025-03-04T15:28:45Z
          Status:                False
          Type:                  Upgradeable
          Last Heartbeat Time:   2025-03-04T15:28:45Z
          Last Transition Time:  2025-03-04T15:28:45Z
          Status:                False
          Type:                  Progressing
          Last Heartbeat Time:   2025-03-04T15:28:45Z
          Last Transition Time:  2025-03-04T15:28:45Z
          Message:               Tuned ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com Degraded Reason: TunedError.
      Tuned ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com Degraded Message: TuneD daemon issued one or more error message(s) during profile application. TuneD stderr: .
      
          Reason:       TunedProfileDegraded
          Status:       True
          Type:         Degraded
        Runtime Class:  performance-performance
        Tuned:          openshift-cluster-node-tuning-operator/openshift-node-performance-performance
      Events:
        Type    Reason              Age                 From                            Message
        ----    ------              ----                ----                            -------
        Normal  Creation succeeded  112m (x9 over 17h)  performance-profile-controller  Succeeded to create all components
      
      [root@ampere-one-x-04 ~]# oc logs pod/tuned-kjc8j
      I0304 15:35:50.346412    3259 controller.go:1666] starting in-cluster ocp-tuned v4.19.0-202502262344.p0.gf166846.assembly.stream.el9-0-g0d9dd16-dirty
      I0304 15:35:50.401840    3259 controller.go:671] writing /var/lib/ocp-tuned/image.env
      I0304 15:35:50.418669    3259 controller.go:702] tunedRecommendFileRead(): read "openshift-node-performance-performance" from "/etc/tuned/recommend.d/50-openshift.conf"
      I0304 15:35:50.419585    3259 controller.go:1728] starting: profile unpacked is "openshift-node-performance-performance" fingerprint "ab0d99d8009d6539b91ed1aeff3e4fa1c629c1cd4e9a32bdc132dcc9737e4fc9"
      I0304 15:35:50.419646    3259 controller.go:1424] recover: no pending deferred change
      I0304 15:35:50.419666    3259 controller.go:1734] starting: no pending deferred update
      I0304 15:36:06.074575    3259 controller.go:382] disabling system tuned...
      I0304 15:36:06.121045    3259 controller.go:1546] started events processors
      I0304 15:36:06.121492    3259 controller.go:359] set log level 0
      I0304 15:36:06.121850    3259 controller.go:1567] monitoring filesystem events on "/etc/tuned/bootcmdline"
      I0304 15:36:06.121886    3259 controller.go:1570] started controller
      I0304 15:36:06.122603    3259 controller.go:692] tunedRecommendFileWrite(): written "/etc/tuned/recommend.d/50-openshift.conf" to set TuneD profile openshift-node-performance-performance
      I0304 15:36:06.122634    3259 controller.go:417] profilesExtract(): extracting 6 TuneD profiles (recommended=openshift-node-performance-performance)
      I0304 15:36:06.210862    3259 controller.go:462] profilesExtract(): recommended TuneD profile openshift-node-performance-performance content unchanged [openshift]
      I0304 15:36:06.211950    3259 controller.go:462] profilesExtract(): recommended TuneD profile openshift-node-performance-performance content unchanged [openshift-node-performance-performance]
      I0304 15:36:06.212311    3259 controller.go:478] profilesExtract(): fingerprint of extracted profiles: "ab0d99d8009d6539b91ed1aeff3e4fa1c629c1cd4e9a32bdc132dcc9737e4fc9"
      I0304 15:36:06.212389    3259 controller.go:818] tunedReload()
      I0304 15:36:06.212493    3259 controller.go:745] starting tuned...
      I0304 15:36:06.212547    3259 run.go:121] running cmd...
      2025-03-04 15:36:06,335 INFO     tuned.daemon.application: TuneD: 2.25.1, kernel: 5.14.0-570.el9.aarch64+64k
      2025-03-04 15:36:06,335 INFO     tuned.daemon.application: dynamic tuning is globally disabled
      2025-03-04 15:36:06,340 INFO     tuned.daemon.daemon: using sleep interval of 1 second(s)
      2025-03-04 15:36:06,340 INFO     tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
      2025-03-04 15:36:06,341 INFO     tuned.daemon.daemon: Using 'openshift-node-performance-performance' profile
      2025-03-04 15:36:06,342 INFO     tuned.profiles.loader: loading profile: openshift-node-performance-performance
      2025-03-04 15:36:06,460 ERROR    tuned.daemon.daemon: Cannot set initial profile. No tunings will be enabled: Cannot load profile(s) 'openshift-node-performance-performance': Cannot find profile 'openshift-node-performance--aarch64-performance' in '['/var/lib/ocp-tuned/profiles', '/usr/lib/tuned', '/usr/lib/tuned/profiles']'.
      2025-03-04 15:36:06,461 INFO     tuned.daemon.controller: starting controller
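
The double dash in 'openshift-node-performance--aarch64-performance' suggests that the vendor part of the include line resolved to an empty string. The sketch below is a simplified model of that resolution, not TuneD's actual code. It assumes lscpu_check walks (regex, result) pairs and returns an empty string when nothing matches. The sample lscpu text is hypothetical: the "APM" vendor value comes from the issue title and no real lscpu output is included in this report. ARCH_BASED_VENDOR_RULES is likewise only an illustration of the fix described in the release note, not the merged patch.

```python
import re


def lscpu_check(lscpu_output: str, *args: str) -> str:
    """Simplified model (assumption) of TuneD's lscpu_check built-in."""
    # Walk the (regex, result) pairs; the first matching regex wins.
    for i in range(0, len(args) - 1, 2):
        if re.search(args[i], lscpu_output, re.MULTILINE):
            return args[i + 1]
    # An odd trailing argument acts as the default; otherwise return "".
    return args[-1] if len(args) % 2 == 1 else ""


# Rules transcribed from the include= line of the dumped tuned.conf.
VENDOR_RULES = (
    r"Vendor ID:\s*GenuineIntel", "intel",
    r"Vendor ID:\s*AuthenticAMD", "amd",
    r"Vendor ID:\s*ARM", "arm",
)
ARCH_RULES = (
    r"Architecture:\s*x86_64", "x86",
    r"Architecture:\s*aarch64", "aarch64",
)
# Hypothetical variant: detect ARM from the Architecture field instead.
ARCH_BASED_VENDOR_RULES = (
    r"Vendor ID:\s*GenuineIntel", "intel",
    r"Vendor ID:\s*AuthenticAMD", "amd",
    r"Architecture:\s*aarch64", "arm",
)

# Hypothetical lscpu excerpt for an Ampere machine.
AMPERE_LSCPU = "Architecture:          aarch64\nVendor ID:             APM\n"


def include_name(lscpu_output: str, vendor_rules: tuple) -> str:
    """Build the platform-specific profile name the include line refers to."""
    vendor = lscpu_check(lscpu_output, *vendor_rules)
    arch = lscpu_check(lscpu_output, *ARCH_RULES)
    return f"openshift-node-performance-{vendor}-{arch}-performance"


if __name__ == "__main__":
    print(include_name(AMPERE_LSCPU, VENDOR_RULES))
    print(include_name(AMPERE_LSCPU, ARCH_BASED_VENDOR_RULES))
```

Under these assumptions the first call reproduces the profile name from the error message, and the second yields openshift-node-performance-arm-aarch64-performance, which is present in /var/lib/ocp-tuned/profiles/.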
      
      sh-5.1# systemctl --no-pager | grep hugepages
        dev-hugepages.mount                                                                                                                                             loaded active mounted   Huge Pages File System
      ● hugepages-allocation-2048kB-NUMA1.service                                                                                                                       loaded failed failed    Hugepages-2048kB allocation on the node 1
        hugepages-allocation-524288kB-NUMA0.service                                                                                                                     loaded active exited    Hugepages-524288kB allocation on the node 0
      
      sh-5.1# systemctl status hugepages-allocation-2048kB-NUMA1.service
      × hugepages-allocation-2048kB-NUMA1.service - Hugepages-2048kB allocation on the node 1
           Loaded: loaded (/etc/systemd/system/hugepages-allocation-2048kB-NUMA1.service; enabled; preset: disabled)
           Active: failed (Result: exit-code) since Tue 2025-03-04 15:32:33 UTC; 17h ago
         Main PID: 1002 (code=exited, status=1/FAILURE)
              CPU: 6ms
      
      Mar 04 15:32:33 ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com systemd[1]: Starting Hugepages-2048kB allocation on the node 1...
      Mar 04 15:32:33 ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com hugepages-allocation.sh[1002]: ERROR: /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages does not exist
      Mar 04 15:32:33 ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com systemd[1]: hugepages-allocation-2048kB-NUMA1.service: Main process exited, code=exited, status=1/FAILURE
      Mar 04 15:32:33 ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com systemd[1]: hugepages-allocation-2048kB-NUMA1.service: Failed with result 'exit-code'.
      Mar 04 15:32:33 ocp-worker-0.libvirt.lab.eng.tlv2.redhat.com systemd[1]: Failed to start Hugepages-2048kB allocation on the node 1.
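
The log only says that the nr_hugepages path does not exist. It does not say whether node1 itself or the 2048kB size is missing on this kernel. A minimal sketch for checking which of the requested (node, size) pairs are exposed in sysfs follows. The helper names are hypothetical, and the sysfs_root parameter exists only so the check can be pointed at a test directory.

```python
from pathlib import Path


def hugepage_sysfs_path(node: int, size_kb: int, sysfs_root: str = "/sys") -> Path:
    """Return the per-node nr_hugepages path for a given page size in kB."""
    return (
        Path(sysfs_root)
        / "devices/system/node"
        / f"node{node}"
        / "hugepages"
        / f"hugepages-{size_kb}kB"
        / "nr_hugepages"
    )


def missing_hugepage_paths(requests, sysfs_root: str = "/sys") -> list[str]:
    """List the nr_hugepages paths that do not exist for the requested pairs."""
    missing = []
    for node, size_kb in requests:
        path = hugepage_sysfs_path(node, size_kb, sysfs_root)
        if not path.exists():
            missing.append(str(path))
    return missing


# (node, size in kB) pairs taken from the PerformanceProfile in this report:
# 512M on node 0 and 2M on node 1.
REQUESTS = [(0, 524288), (1, 2048)]

if __name__ == "__main__":
    for path in missing_hugepage_paths(REQUESTS):
        print("missing:", path)
```

Running this on the affected worker would show whether only the 2048kB entry on node 1 is absent, as the failed unit suggests.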
      
      sh-5.1# cat /proc/cmdline
      BOOT_IMAGE=(hd0,gpt3)/boot/ostree/rhcos-e032e3de5cffeccaf88bc5dc1945da35b4273c5f5b758a6ca1d0d78344b55e7f/vmlinuz-5.14.0-570.el9.aarch64+64k rw ostree=/ostree/boot.0/rhcos/e032e3de5cffeccaf88bc5dc1945da35b4273c5f5b758a6ca1d0d78344b55e7f/0 ignition.platform.id=openstack console=ttyAMA0,115200n8 console=tty0 root=UUID=96763b3b-e217-4879-a03e-56568ca84bf9 rw rootflags=prjquota boot=UUID=d98055a6-2355-40d3-8e87-98eedd0e8c91 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=0

      bash-5.1# ls /var/lib/ocp-tuned/profiles/
      openshift                                           openshift-node-performance-intel-x86-performance
      openshift-node-performance-amd-x86-performance      openshift-node-performance-performance
      openshift-node-performance-arm-aarch64-performance  openshift-node-performance-rt-performance
      bash-5.1# cat /var/lib/ocp-tuned/profiles/openshift-node-performance-performance/tuned.conf
      [main]
      summary=Openshift node optimized for deterministic performance at the cost of increased power consumption, focused on low latency network performance. Based on Tuned 2.11 and Cluster node tuning (oc 4.5)
      # The final result of the include depends on cpu vendor, cpu architecture, and whether the real time kernel is enabled
      # The first line will be evaluated based on the CPU vendor and architecture
      # This has three possible results:
      #   include=openshift-node-performance-amd-x86;
      #   include=openshift-node-performance-arm-aarch64;
      #   include=openshift-node-performance-intel-x86;
      # The second line will be evaluated based on whether the real time kernel is enabled
      # This has two possible results:
      #     openshift-node,cpu-partitioning
      #     openshift-node,cpu-partitioning,openshift-node-performance-rt-<PerformanceProfile name>
      include=openshift-node,cpu-partitioning${f:regex_search_ternary:${f:exec:uname:-r}:rt:,openshift-node-performance-rt-performance:};
          openshift-node-performance-${f:lscpu_check:Vendor ID\:\s*GenuineIntel:intel:Vendor ID\:\s*AuthenticAMD:amd:Vendor ID\:\s*ARM:arm}-${f:lscpu_check:Architecture\:\s*x86_64:x86:Architecture\:\s*aarch64:aarch64}-performance
      # Inheritance of base profiles legend:
      # cpu-partitioning -> network-latency -> latency-performance
      # https://github.com/redhat-performance/tuned/blob/master/profiles/latency-performance/tuned.conf
      # https://github.com/redhat-performance/tuned/blob/master/profiles/network-latency/tuned.conf
      # https://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/tuned.conf
      # All values are mapped with a comment where a parent profile contains them.
      # Different values will override the original values in parent profiles.
      [variables]
      #> isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
      isolated_cores=1-6
      
      not_isolated_cores_expanded=${f:cpulist_invert:${isolated_cores_expanded}}
      
      [cpu]
      #> latency-performance
      #> (override)
      force_latency=cstate.id:1|3
      governor=performance
      energy_perf_bias=performance
      min_perf_pct=100
       
      [service]
      service.stalld=start,enable
      
      [vm]
      #> network-latency
      transparent_hugepages=never
      
      [irqbalance]
      # Disable the plugin entirely, which was enabled by the parent profile `cpu-partitioning`.
      # It can be racy if TuneD restarts for whatever reason.
      #> cpu-partitioning
      enabled=false
      
      [scheduler]
      runtime=0
      group.ksoftirqd=0:f:11:*:ksoftirqd.*
      group.rcuc=0:f:11:*:rcuc.*
      group.ktimers=0:f:11:*:ktimers.*
      default_irq_smp_affinity = ignore
      irq_process=false
      
      [sysctl]
      #> cpu-partitioning #RealTimeHint
      kernel.hung_task_timeout_secs=600
      #> cpu-partitioning #RealTimeHint
      kernel.nmi_watchdog=0
      #> RealTimeHint
      kernel.sched_rt_runtime_us=-1
      #> cpu-partitioning  #RealTimeHint
      vm.stat_interval=10
      # cpu-partitioning and RealTimeHint for RHEL disable it (= 0)
      # OCP is too dynamic when partitioning and needs to evacuate
      #> scheduled timers when starting a guaranteed workload (= 1)
      kernel.timer_migration=1
      #> network-latency
      net.ipv4.tcp_fastopen=3
      # If a workload mostly uses anonymous memory and it hits this limit, the entire
      # working set is buffered for I/O, and any more write buffering would require
      # swapping, so it's time to throttle writes until I/O can catch up.  Workloads
      # that mostly use file mappings may be able to use even higher values.
      #
      # The generator of dirty data starts writeback at this percentage (system default
      # is 20%)
      #> latency-performance
      vm.dirty_ratio=10
      # Start background writeback (via writeback threads) at this percentage (system
      # default is 10%)
      #> latency-performance
      vm.dirty_background_ratio=3
      # The swappiness parameter controls the tendency of the kernel to move
      # processes out of physical memory and onto the swap disk.
      # 0 tells the kernel to avoid swapping processes out of physical memory
      # for as long as possible
      # 100 tells the kernel to aggressively swap processes out of physical memory
      # and move them to swap cache
      #> latency-performance
      vm.swappiness=10
      # also configured via a sysctl.d file
      # placed here for documentation purposes and commented out due
      # to a tuned logging bug complaining about duplicate sysctl:
      #   https://issues.redhat.com/browse/RHEL-18972
      #> rps configuration
      # net.core.rps_default_mask=${not_isolated_cpumask}
      
      [selinux]
      #> Custom (atomic host)
      avc_cache_threshold=8192
      
      [net]
      channels=combined 2
      nf_conntrack_hashsize=131072
      
      [bootloader]
      # !! The names are important for Intel and are referenced in openshift-node-performance-intel-x86
      # set empty values to disable RHEL initrd setting in cpu-partitioning
      initrd_remove_dir=
      initrd_dst_img=
      initrd_add_dir=
      # overrides cpu-partitioning cmdline
      cmdline_cpu_part=+nohz=on rcu_nocbs=${isolated_cores} tuned.non_isolcpus=${not_isolated_cpumask} systemd.cpu_affinity=${not_isolated_cores_expanded}
      # No default value but will be composed conditionally based on platform
      cmdline_iommu=
      
      cmdline_isolation=+isolcpus=managed_irq,${isolated_cores}
       
      cmdline_realtime_nohzfull=+nohz_full=${isolated_cores}
      cmdline_realtime_nosoftlookup=+nosoftlockup
      cmdline_realtime_common=+skew_tick=1 rcutree.kthread_prio=11
       
      # No default value but will be composed conditionally based on platform
      cmdline_power_performance=
       
      # No default value but will be composed conditionally based on platform
      cmdline_idle_poll=
       
       
      
      [rtentsk]

              msivak@redhat.com Martin Sivak
              rh-ee-rshemtov Roy Shemtov
              Roy Shemtov Roy Shemtov