OpenShift Bugs: OCPBUGS-5452

MCO takes a long time to update the node count for an MCP


    • None
    • MCO Sprint 247, MCO Sprint 248
    • 2
    • False
    • Previously, when a node was removed from a `MachineConfigPool`, the Machine Config Operator (MCO) did not report an error or the removal of the node. The MCO does not support managing nodes when they are not in a pool and there was no indication that node management ceased after the node was removed. With this release, if a node is removed from all pools, the MCO now logs an error. (link:https://issues.redhat.com/browse/OCPBUGS-5452[*OCPBUGS-5452*])
    • Bug Fix
    • Done

      Description of problem:

      The MCO takes too long to update the node count of an MCP after the label that the MCP uses to select its nodes is removed from a node.

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      100%

      Steps to Reproduce:

      1. Remove the `node-role.kubernetes.io/worker=` label from any worker node.
      ~~~
      # oc label node worker-0.sharedocp4upi411ovn.lab.upshift.rdu2.redhat.com node-role.kubernetes.io/worker-
      ~~~
      2. Check the worker MCP for the correct node count.
      ~~~
      # oc get mcp worker
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      worker   rendered-worker-6916abae250ad092875791f8297c13e1   True      False      False      3              3                   3                     0                      5d7h
      ~~~
      3. Check again after 10-15 minutes.
      ~~~
      # oc get mcp worker
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      worker   rendered-worker-6916abae250ad092875791f8297c13e1   True      False      False      2              2                   2                     0                      5d7h
      ~~~
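
To put a number on the delay, the manual checks in steps 2 and 3 could be replaced by a small polling loop. The following is only a sketch: `mcp_machine_count` and `wait_for_count_change` are hypothetical helper names, and it assumes `oc` is logged in to the cluster and prints the default table layout shown above (with a non-empty CONFIG column).

```shell
# Hypothetical helper: read `oc get mcp` table output on stdin and print the
# MACHINECOUNT value for the given pool. The column index is looked up from
# the header row instead of being hard-coded.
mcp_machine_count() {
  awk -v pool="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "MACHINECOUNT") col = i; next }
    $1 == pool && col { print $col }
  '
}

# Hypothetical helper: poll the pool until its machine count differs from the
# value seen at start, then report how many seconds that took.
# Usage: wait_for_count_change <pool> [interval-seconds]
wait_for_count_change() {
  pool="$1"
  interval="${2:-10}"
  before="$(oc get mcp "$pool" | mcp_machine_count "$pool")"
  start="$(date +%s)"
  while :; do
    now="$(oc get mcp "$pool" | mcp_machine_count "$pool")"
    if [ "$now" != "$before" ]; then
      echo "machine count changed from $before to $now after $(( $(date +%s) - start ))s"
      return 0
    fi
    sleep "$interval"
  done
}
```

Run `wait_for_count_change worker` immediately after removing the label in step 1. The loop has no timeout, so interrupt it manually if the count never changes.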

      Actual results:

      It took 10-15 minutes for the MCP to detect the node removal.

      Expected results:

      The MCP should detect the node removal as soon as the matching label is removed from the node.

      Additional info:

       

              cdoern@redhat.com Charles Doern
              rhn-support-dpateriy Divyam Pateriya
              Rio Liu Rio Liu