OpenShift Bugs / OCPBUGS-45263

Invalid PerformanceProfile applied successfully to the node


    • Bug
    • Resolution: Duplicate
    • Undefined
    • None
    • 4.17
    • Node Tuning Operator
    • None
    • None
    • CNF Compute Sprint 263
    • 1
    • False
    • None
    • This fix prevents a user from entering an invalid string for any cpuset, which would otherwise break the cluster.
    • Bug Fix
    • In Progress

      Description of problem:

      A PerformanceProfile with an invalid CPU configuration is applied successfully to the node and written to the relevant files, crashing the node.

      Version-Release number of selected component (if applicable):

      4.17.0-0.nightly-2024-11-09-011306    

      How reproducible:

      Apply a PerformanceProfile with invalid cpu values

      Steps to Reproduce:

      Apply the following PerformanceProfile with invalid cpu values:
      
      apiVersion: performance.openshift.io/v2
      kind: PerformanceProfile
      metadata:
        name: customcnf
      spec:
        cpu:
          isolated: '{{hub fromConfigMap "" "hw-types" (printf "%s-cnf-isolated-cpu" .ManagedClusterName) | toLiteral hub}}'
          reserved: '{{hub fromConfigMap "" "hw-types" (printf "%s-cnf-reserved-cpu" .ManagedClusterName) | toLiteral hub}}'
        hugepages:
          defaultHugepagesSize: 1G
          # TODO(yprokule): update to use hub side templating
          pages:
          - size: 1G
            count: 6
            node: 0
          - size: 1G
            count: 6
            node: 1
        machineConfigPoolSelector:
          pools.operator.machineconfiguration.openshift.io/customcnf: ""
        nodeSelector:
          node-role.kubernetes.io/customcnf: ""     

      Actual results:

      The invalid CPU configuration is applied to the node and written to the relevant files, crashing the node. For example:
      
      E1202 12:52:11.652945  147887 on_disk_validation.go:251] content mismatch for file "/etc/kubernetes/kubelet.conf" (-want +got):
        []uint8(
              """
              ... // 135 identical lines
              podPidsLimit: 4096
              protectKernelDefaults: true
      -       reservedSystemCPUs: '{{hub fromConfigMap "" "hw-types" (printf "%s-cnf-reserved-cpu"
      -         .ManagedClusterName) | toLiteral hub}}'
              rotateCertificates: true
              runtimeRequestTimeout: 0s
              ... // 19 identical lines
              """
        )
      E1202 12:52:11.653016  147887 writer.go:226] Marking Degraded due to: unexpected on-disk state validating against rendered-customcnf-fb52a08022b903f6f39e02f50eb0488c: content mismatch for file "/etc/kubernetes/kubelet.conf"
      I1202 12:52:33.468476  147887 certificate_writer.go:303] Certificate was synced from controllerconfig resourceVersion 15959000
      
      

      Expected results:

      The invalid config should not be applied, and the validation webhook should return an error stating that the CPU configuration is invalid.
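
      To illustrate the kind of check the expected result implies, here is a minimal standalone sketch in Python. It is not the Node Tuning Operator's webhook code. It assumes cpuset strings follow the Linux cpu list format (for example "0-3,8,10-11"), does not handle strides such as "0-10:2", and treats an empty string as invalid. The names parse_cpuset and validate_cpu_fields are hypothetical.

```python
import re

# Assumption: one element is either a single CPU id ("8") or a range ("0-3").
_CPU_LIST_ITEM = re.compile(r"([0-9]+)(?:-([0-9]+))?")


def parse_cpuset(value: str) -> set[int]:
    """Parse a cpu list string into a set of CPU ids, or raise ValueError."""
    if not value.strip():
        raise ValueError("cpuset must not be empty")
    cpus: set[int] = set()
    for raw_item in value.split(","):
        item = raw_item.strip()
        match = _CPU_LIST_ITEM.fullmatch(item)
        if match is None:
            raise ValueError(f"invalid cpuset element {item!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) is not None else start
        if end < start:
            raise ValueError(f"invalid cpuset range {item!r}: end is lower than start")
        cpus.update(range(start, end + 1))
    return cpus


def validate_cpu_fields(cpu: dict[str, str]) -> list[str]:
    """Return one error message per invalid cpuset field (hypothetical helper)."""
    errors = []
    for field, value in cpu.items():
        try:
            parse_cpuset(value)
        except ValueError as exc:
            errors.append(f"spec.cpu.{field}: {exc}")
    return errors
```

      With a check like this, an unrendered hub template such as the one in the reproducer would be rejected before it reaches kubelet.conf, because the template text is not a list of CPU ids or ranges.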

              rh-ee-rbaturov Ronny Baturov
              saledort@redhat.com Sabina Aledort
              Mallapadi Niranjan Mallapadi Niranjan