OCPBUGS-36314

NROP: Cannot get a working configuration when using schedulable control plane


      Description of problem:

      On a cluster with schedulable control plane nodes, a NUMAResourcesOperator resource was created with the default machineConfigPoolSelector from the documentation:
      
       - machineConfigPoolSelector:
            matchLabels:
              pools.operator.machineconfiguration.openshift.io/worker: "" 
      
      A MachineConfig resource named 51-numaresourcesoperator-worker is created and added to the worker MCP, so only the worker-only nodes are restarted.
      
      After the worker nodes are rebooted, there is one numaresourcesoperator-worker-***** pod for each node, both masters and workers. However, the pods associated with master nodes fail to start, because the rte SELinux policy is not created.
      
      If the machineConfigPoolSelector is changed to:
      
        - machineConfigPoolSelector:
            pools.operator.machineconfiguration.openshift.io/worker: ""
      
      or two nodeGroups are created (one for the master MCP and one for the worker MCP), another MachineConfig resource, named 51-numaresourcesoperator-master, is created. All master nodes are rebooted, and all numaresourcesoperator-worker-***** pods start successfully after that. However, the operator also creates a set of numaresourcesoperator-master-***** pods, so there are two resource-topology-exporter pods per node, which is an issue for the operator.
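
      For reference, the two-nodeGroups variant described above might look like the following sketch. The apiVersion, the resource name, and the master pool label are assumptions (they follow the pattern of the documented worker example and are not copied from the affected cluster), so adjust them to the actual environment.

      ```yaml
      # Sketch only: apiVersion and metadata.name are assumed, not taken from the affected cluster.
      apiVersion: nodetopology.openshift.io/v1
      kind: NUMAResourcesOperator
      metadata:
        name: numaresourcesoperator
      spec:
        nodeGroups:
        # Node group for the worker MCP (selector as in the documentation).
        - machineConfigPoolSelector:
            matchLabels:
              pools.operator.machineconfiguration.openshift.io/worker: ""
        # Node group for the master MCP (label assumed by analogy with the worker one).
        - machineConfigPoolSelector:
            matchLabels:
              pools.operator.machineconfiguration.openshift.io/master: ""
      ```

      With this layout, nodes that belong to both pools are the ones that end up with two resource-topology-exporter pods, as described above.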

      Version-Release number of selected component (if applicable):

      Seen with OpenShift 4.14.26 and NUMA Resources Operator 4.14.5

      How reproducible:

      Always

      Steps to Reproduce:

          1. Deploy a cluster with schedulable control plane nodes (for example, a 3+1 compact cluster).
          2. Create a NUMAResourcesOperator resource with the machineConfigPoolSelector described above.
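
      A minimal manifest for step 2 could look like the sketch below, using the default worker selector from the description. The apiVersion and metadata.name are assumptions rather than values confirmed on the affected cluster.

      ```yaml
      # Sketch only: apiVersion and metadata.name are assumed.
      apiVersion: nodetopology.openshift.io/v1
      kind: NUMAResourcesOperator
      metadata:
        name: numaresourcesoperator
      spec:
        nodeGroups:
        # Default selector from the documentation; it matches only the worker MCP.
        - machineConfigPoolSelector:
            matchLabels:
              pools.operator.machineconfiguration.openshift.io/worker: ""
      ```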
          

      Actual results:

      No configuration yields exactly one resource-topology-exporter pod running successfully on each control plane node: either the pod fails to start, or two pods run per node.

      Expected results:

      There is exactly one resource-topology-exporter pod running successfully on each control plane node.

      Additional info:

      A partner has hit this issue. We have a cluster in the lab where we can reproduce it and run any tests.

              fromani@redhat.com Francesco Romani
              jpena@redhat.com Javier Pena
              Shereen Haj Shereen Haj