OpenShift Request For Enhancement
RFE-8869

[HCP] | Guest Cluster Node-Level Deletion Preference for HyperShift NodePool Scale-Down

    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Normal
    • Component: Hosted Control Planes
    • Product / Portfolio Work

      1. Proposed title of this feature request

      Guest Cluster Node-Level Deletion Preference for HyperShift NodePool Scale-Down

      2. What is the nature and description of the request?

      In HyperShift Hosted Control Plane (HCP) environments, when a NodePool is scaled down, the CAPI MachineSet controller selects which Machine (and corresponding guest cluster worker node) to remove. Today, the only way to influence this selection is by annotating the Machine object with cluster.x-k8s.io/delete-machine=yes in the management cluster namespace.
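      For reference, today's management-side workflow looks roughly like the following sketch. The namespace, Machine, and NodePool names are illustrative, not taken from any real environment; the typical HyperShift convention is that Machines live in a namespace derived from the HostedCluster namespace and name.

      ```shell
      # On the MANAGEMENT cluster: list the Machines backing the NodePool.
      # "clusters-example-hc" is an illustrative namespace.
      oc get machines.cluster.x-k8s.io -n clusters-example-hc

      # Mark a specific Machine as the preferred deletion candidate.
      # This is the only supported signal today, and it requires
      # management-cluster access.
      oc annotate machine.cluster.x-k8s.io example-nodepool-abc12 \
        -n clusters-example-hc cluster.x-k8s.io/delete-machine=yes

      # Scale down the NodePool; CAPI removes the annotated Machine first.
      oc scale nodepool example-nodepool -n clusters --replicas=2
      ```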

      This request asks for the ability for guest cluster administrators to signal — from within the guest cluster itself — which worker nodes should be preferentially removed during a NodePool scale-down operation.

      Currently, no guest cluster-level signals are considered during deletion priority evaluation:

      • Cordoning a node (oc adm cordon <node>) — not considered. A cordoned node is not prioritized for deletion.
      • Tainting a node — not considered. No taint-based deletion priority exists.

      Because none of these signals influences deletion priority, all machines receive equal priority, and the tiebreaker is alphabetical ordering of Machine names — a non-intuitive and operationally unpredictable outcome.
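      Concretely, these are the standard node-level operations a guest cluster admin might expect to matter, run inside the guest cluster; today neither has any effect on scale-down selection. The node name and taint key are illustrative.

      ```shell
      # Inside the GUEST cluster ("worker-node-1" is an example name).

      # Cordon: sets spec.unschedulable=true, but does not raise the
      # node's deletion priority during NodePool scale-down today.
      oc adm cordon worker-node-1

      # Taint: likewise not consulted by the CAPI deletion priority logic.
      oc adm taint node worker-node-1 decommission=true:NoSchedule
      ```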

      The request is to introduce a mechanism (such as a specific node annotation, taint, or cordon awareness) that the NodePool or CAPI controller would honor during scale-down to preferentially delete the signaled node. This could take the form of:

      • A guest-side node annotation (e.g., hypershift.openshift.io/prefer-delete) that is propagated to the corresponding Machine's cluster.x-k8s.io/delete-machine annotation
      • Extending the CAPI deletion priority logic to read guest node state (taints, spec.unschedulable, or specific annotations)
      • Extending the CAPI Machine controller to watch the corresponding guest cluster Node for specific taints or annotations and reflect them as Machine-level conditions or annotations that the existing deletion priority logic can consume. For example, if a guest cluster admin taints a node with node.hypershift.openshift.io/prefer-delete:NoSchedule, the Machine controller would detect this and set the cluster.x-k8s.io/delete-machine annotation on the corresponding Machine, making it the highest-priority candidate for deletion under the existing CAPI logic.
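      Under the third option, the guest-side workflow could look like the sketch below. Note that the taint key node.hypershift.openshift.io/prefer-delete is the proposed signal from this RFE; no shipped controller honors it today, and all resource names are illustrative.

      ```shell
      # PROPOSED workflow — this taint is not honored by any existing controller.

      # Inside the guest cluster, the admin marks the degraded node:
      oc adm taint node worker-node-1 node.hypershift.openshift.io/prefer-delete:NoSchedule

      # The (proposed) Machine controller extension would then set
      #   cluster.x-k8s.io/delete-machine=yes
      # on the corresponding Machine in the management cluster, so the
      # next NodePool scale-down removes this node first:
      oc scale nodepool example-nodepool -n clusters --replicas=2
      ```

      This keeps the existing CAPI deletion priority logic unchanged and only adds a guest-to-management propagation step, which preserves the separation of responsibilities described below.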

       

      3. Why does the customer need this? (List the business requirements here)

      Separation of responsibilities: In many organizations, the management cluster and guest clusters are operated by different teams. The management cluster team owns the HyperShift operator, HostedClusters, NodePools, and CAPI Machine objects; the guest cluster team owns workloads, nodes, and day-2 operations. Today, removing a specific unhealthy or problematic node requires the guest cluster team to escalate to the management cluster team to annotate the correct Machine — breaking the self-service operational model.

      Operational efficiency: When a guest cluster admin identifies a degraded node (hardware issues, kernel problems, stuck workloads, kubelet failures), they need to remove it quickly. The current process requires cross-team coordination, ticket creation, and waiting for the management cluster team to act. This delays remediation and extends the impact on workloads running on the degraded node.

      Predictable scale-down behavior: Currently, scaling down a NodePool removes a node based on alphabetical Machine name ordering (when all machines have equal priority). This is non-intuitive and can result in a healthy, heavily-utilized node being removed while a degraded or empty node remains. Customers expect that standard operational signals like cordoning or tainting a node would influence which node gets removed.

      Alignment with Kubernetes operational practices: Cordoning and tainting are standard Kubernetes practices for marking nodes for decommissioning. Customers moving from self-managed OpenShift to HCP expect these familiar workflows to carry over. The inability to influence node deletion from the guest cluster is a gap in the HCP operational experience.

      Reduced risk of workload disruption: Without targeted deletion, a scale-down may remove a node running critical workloads while leaving an empty or degraded node in the cluster. Allowing guest cluster admins to direct deletion reduces unnecessary pod evictions and scheduling churn.

      4. List any affected packages or components.

      Cluster API (CAPI)


              racedoro@redhat.com Ramon Acedo
              rhn-support-dpateriy Divyam Pateriya