OCPBUGS-53153

MCO requires reboot on worker nodes when creating new MC (even for MCP with zero machine count)

      Description of problem:

When creating a new MCP whose machineConfigSelector is specified via matchExpressions, it appears that `worker` must be among the values of the label "machineconfiguration.openshift.io/role". When such an MCP is created, even with no nodes carrying the label specified in its nodeSelector, a new MC is created and a reboot is triggered on the worker nodes.

      Version-Release number of selected component (if applicable):

Seen on 4.18; other versions were not tested.

      How reproducible:

Always, on a fresh OCP cluster. Reapplying the reproduction steps with a completely new MC and MCP does not trigger the reboot after the first time.

      Steps to Reproduce:

    1. On a vanilla, fresh OCP cluster that has a worker pool, create an MCP like the one below, to which no node belongs and for which no MachineConfig exists (no node carries the node label below), i.e. a completely new MCP:
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      metadata:
        name: my-test
        labels:
          machineconfiguration.openshift.io/role: my-test
      spec:
        machineConfigSelector:
          matchExpressions:
            - {
                 key: machineconfiguration.openshift.io/role,
                 operator: In,
                 values: [worker,my-test],
              }
        nodeSelector:
          matchLabels:
            node-role.kubernetes.io/my-test: ""
      
     2. This creates an empty MCP and a new MC, but it also triggers a reboot on the worker nodes.
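For illustration of the selector semantics involved (this MC is hypothetical and not part of the reproduction): because the MCP above selects role In [worker, my-test], any MachineConfig carrying either role value is matched by the my-test pool, including all pre-existing role=worker MCs. A minimal sketch of such an MC:

```yaml
# Hypothetical MachineConfig, for illustration only. With the MCP above,
# this MC (labeled role=my-test) and every existing role=worker MC would
# both be matched by the my-test pool when its config is rendered.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-my-test-example   # hypothetical name
  labels:
    machineconfiguration.openshift.io/role: my-test
spec:
  config:
    ignition:
      version: 3.4.0         # empty Ignition payload; defines no change
```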

      Actual results:

    Worker nodes are rebooted and the rendered MC of the worker MCP changes.

      Expected results:

The MCP has a machine count of 0, so it is anticipated that no node is rebooted and the rendered MC of the worker MCP is unchanged.

      Additional info:

Deleting the added MCP again triggers a reboot on the worker nodes.
It also does not remove the corresponding MC, so if an MCP with the same machineConfigSelector is created again, no reboot happens on any node. It is not clear whether keeping the old MC is expected.

       

Comments:

Shereen Haj added a comment - djoshy Shouldn't it be an MCO warning/error that a node matches multiple MCPs?

            Jiri Mencak added a comment -

            Sounds like misconfiguration to me:

            profile cnfdr9.telco5g.eng.rdu2.redhat.com uses machineConfigLabels that match across multiple MCPs (my-test,worker,worker-cnf); this is not supported 


Yu Qi Zhang added a comment - Leaving the priority undefined until we can assess the conditions and impact.

Shereen Haj added a comment - MC and MCP yamls and the must-gather can be found at: https://drive.google.com/drive/folders/1JD8VHZlJO95-nLz1Miox409C6_Nukj4u?usp=drive_link

Team: Team NTO
Reporter: Shereen Haj
Assignee: Sergio Regidor de la Rosa