OpenShift Bugs · OCPBUGS-60913

After scale-down, the last node keeps the ToBeDeletedByClusterAutoscaler taint when a new machineset is created


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 4.17.z
    • 4.17.z, 4.18.z, 4.19.z
    • Cluster Autoscaler
    • Quality / Stability / Reliability
    • False
    • 0
    • Important
    • AUTOSCALE - Sprint 276, AUTOSCALE - Sprint 277
    • 2
    • In Progress
    • Bug Fix
      Previously, the cluster autoscaler included Machine objects in a deleting state when counting Machines, which made its Machine count inaccurate. The inaccurate count caused the autoscaler to add taints that were not needed. Now, the autoscaler counts Machines correctly.

      Description of problem:

      If we create a new machineset and then scale the cluster up and down, the last node always keeps the ToBeDeletedByClusterAutoscaler taint.

      Version-Release number of selected component (if applicable):

      4.19.0-0.nightly-2025-07-29-193521

      How reproducible:

      Always

      Steps to Reproduce:

          1. Create a new machineset
          2. Create a ClusterAutoscaler and a MachineAutoscaler
          3. Add workload to scale up the cluster
          4. After the cluster is stable, remove the workload
          5. Check the taints on the last node
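      The autoscaler resources in steps 1–2 can be sketched as minimal manifests. This is an illustrative sketch, not the exact configuration used in this report; the MachineAutoscaler name and the machineset name are placeholders.

      ```yaml
      # ClusterAutoscaler: cluster-wide autoscaling settings (singleton named "default")
      apiVersion: autoscaling.openshift.io/v1
      kind: ClusterAutoscaler
      metadata:
        name: default
      spec:
        scaleDown:
          enabled: true
      ---
      # MachineAutoscaler: ties scaling bounds to the new machineset
      # (replace <machineset-name> with the MachineSet created in step 1)
      apiVersion: autoscaling.openshift.io/v1beta1
      kind: MachineAutoscaler
      metadata:
        name: worker1-autoscaler
        namespace: openshift-machine-api
      spec:
        minReplicas: 1
        maxReplicas: 3
        scaleTargetRef:
          apiVersion: machine.openshift.io/v1beta1
          kind: MachineSet
          name: <machineset-name>
      ```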

      Actual results:

      After scale-down, the last node always has the ToBeDeletedByClusterAutoscaler taint, and the last machine has the delete-machine annotation.
      
      $ oc get node       
      NAME                                        STATUS                     ROLES                  AGE     VERSION
      ip-10-0-10-3.us-east-2.compute.internal     Ready                      worker                 6h26m   v1.30.14
      ip-10-0-46-133.us-east-2.compute.internal   Ready                      worker                 53m     v1.30.14
      ip-10-0-58-23.us-east-2.compute.internal    Ready                      control-plane,master   6h33m   v1.30.14
      ip-10-0-78-119.us-east-2.compute.internal   Ready                      worker                 40m     v1.30.14
      ip-10-0-8-143.us-east-2.compute.internal    Ready                      control-plane,master   6h33m   v1.30.14
      ip-10-0-84-9.us-east-2.compute.internal     Ready,SchedulingDisabled   worker                 3h7m    v1.30.14
      ip-10-0-88-62.us-east-2.compute.internal    Ready                      control-plane,master   6h33m   v1.30.14
      $ oc get node ip-10-0-78-119.us-east-2.compute.internal -o yaml | grep ToB  
          key: ToBeDeletedByClusterAutoscaler
      
      
      $ oc get machine zhsunaws419-z45gt-worker1-gq8bb -o yaml           
      apiVersion: machine.openshift.io/v1beta1
      kind: Machine
      metadata:
        annotations:
          machine.openshift.io/cluster-api-delete-machine: 2025-07-30 16:28:51.457550971
            +0000 UTC m=+914.784789272
          machine.openshift.io/delete-machine: 2025-07-30 16:28:51.457547011 +0000 UTC m=+914.784785329
          machine.openshift.io/instance-state: running

      Expected results:

      After scale-down, the last node should have no ToBeDeletedByClusterAutoscaler taint.

      Additional info:

      Original bug: https://issues.redhat.com/browse/OCPBUGS-54231
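      The fix described in the release note amounts to excluding Machines that are already being deleted from the replica count. A minimal sketch of that filtering logic, in Python for illustration (the actual autoscaler is written in Go; the dictionary layout below mirrors Kubernetes object metadata, not the real implementation):

      ```python
      def count_active_machines(machines):
          """Count Machines, excluding those already marked for deletion.

          A Machine with a non-empty metadata.deletionTimestamp is being
          deleted and must not count toward the node group's size; otherwise
          the autoscaler over-counts and taints an extra node.
          """
          return sum(
              1 for m in machines
              if not m.get("metadata", {}).get("deletionTimestamp")
          )

      machines = [
          {"metadata": {"name": "worker1-a"}},
          {"metadata": {"name": "worker1-b",
                        "deletionTimestamp": "2025-07-30T16:28:51Z"}},
      ]
      print(count_active_machines(machines))  # deleting Machine excluded -> 1
      ```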

              mimccune@redhat.com Michael McCune
              rhn-support-zhsun Zhaohua Sun
              Paul Rozehnal
