OpenShift Bugs / OCPBUGS-63452

Control plane deployments lack finalizers, risking orphaned cloud resources on accidental deletion


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • 4.20.z
    • HyperShift
    • Quality / Stability / Reliability
    • False
    • Moderate

      Description of problem:

      Control plane deployments created by the control-plane-operator (including capi-provider, cluster-api, and cloud-controller-manager-*) have no finalizers to protect them against accidental deletion. These
      deployments run the controllers for CAPI resources (MachineDeployment, Machine, platform-specific machines), which do have finalizers. Those finalizers are only removed once the owning controller has completed its cleanup.

      If a deployment like capi-provider is accidentally deleted before HostedCluster deletion:

      1. The deployment is deleted immediately (no finalizer protection)
      2. The CAPI provider controller stops running
      3. During HostedCluster deletion, CAPI resources are marked for deletion
      4. CAPI resource finalizers cannot be processed (controller is gone)
      5. Cloud resources (EC2 instances, VMs, disks, NICs, load balancers) are orphaned
      6. CAPI resources remain stuck in the Terminating state indefinitely
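
      The sequence above can be illustrated with a small in-memory model of finalizer semantics. This is a sketch only, not HyperShift or Kubernetes code: the Resource and Cluster classes, the simulate function, and the finalizer name hypothetical.example/machine are all invented for illustration. The model assumes that an object without finalizers is removed as soon as it is deleted, and that an object with finalizers lingers until a running controller removes them.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Toy stand-in for a Kubernetes object; not a real API type."""
    name: str
    finalizers: list = field(default_factory=list)
    deleting: bool = False  # models metadata.deletionTimestamp being set


class Cluster:
    """In-memory model of delete semantics with and without finalizers."""

    def __init__(self):
        self.objects = {}

    def add(self, res):
        self.objects[res.name] = res

    def delete(self, name):
        res = self.objects.get(name)
        if res is None:
            return
        if res.finalizers:
            res.deleting = True      # object lingers in Terminating
        else:
            del self.objects[name]   # removed immediately

    def run_controller(self, finalizer):
        """One controller pass: clean up, then drop the controller's finalizer."""
        for name, res in list(self.objects.items()):
            if res.deleting and finalizer in res.finalizers:
                res.finalizers.remove(finalizer)
                if not res.finalizers:
                    del self.objects[name]


def simulate(controller_deleted_first):
    """Return the names of objects left over after the HostedCluster teardown."""
    machine_finalizer = "hypothetical.example/machine"  # illustrative name
    cluster = Cluster()
    cluster.add(Resource("capi-provider"))               # Deployment, no finalizer
    cluster.add(Resource("awsmachine-0", [machine_finalizer]))
    if controller_deleted_first:
        cluster.delete("capi-provider")                  # steps 1 and 2
    cluster.delete("awsmachine-0")                       # step 3
    if "capi-provider" in cluster.objects:               # controller runs only if deployed
        cluster.run_controller(machine_finalizer)
    return sorted(cluster.objects)


print(simulate(controller_deleted_first=True))
print(simulate(controller_deleted_first=False))
```

      With the deployment deleted first, the machine object is the one left behind, which corresponds to steps 4 through 6. With the deployment still present, the controller pass removes the machine and only the deployment remains.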

      This affects all CAPI-based platforms: AWS, Azure, GCP, OpenStack, KubeVirt, PowerVS, Agent.

      Code references:

      • No finalizers on deployments: support/controlplane-component/builder.go (NewDeploymentComponent)
      • CAPI resources have finalizers: vendor/sigs.k8s.io/cluster-api/api/v1beta1/machinedeployment_types.go:30
      • Platform machine finalizers: vendor/sigs.k8s.io/cluster-api-provider-aws/v2/api/v1beta2/awsmachine_types.go:27
      • Deletion flow: hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go:3185
      • Compare with NodePool which correctly uses finalizer: hypershift-operator/controllers/nodepool/nodepool_controller.go:54

      Version-Release number of selected component (if applicable):
      All HyperShift versions

      How reproducible:
      Always, whenever the capi-provider or cluster-api deployment is deleted before the HostedCluster

      Steps to Reproduce:

      1. Create HostedCluster with NodePool on any CAPI platform (AWS, Azure, etc.)
      2. Wait for MachineDeployment and cloud instances to be created
      3. Delete capi-provider deployment: oc delete deployment capi-provider -n <control-plane-namespace>
      4. Delete the HostedCluster: oc delete hostedcluster <name>

      Actual results:

      • Deployment deletes immediately without finalizer protection
      • CAPI resources (MachineDeployment, Machine, AWSMachine) remain stuck in the Terminating state
      • Cloud resources (EC2 instances, VMs, disks) are orphaned and continue running
      • Manual cleanup required via cloud provider console
      • Potential cost implications from orphaned resources
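
      One way to confirm the stuck state is to look for objects that carry a deletionTimestamp while still holding finalizers. The helper below is a hedged sketch: the function name stuck_terminating is hypothetical, and it assumes its input is the JSON list produced by a command such as oc get <resource> -n <control-plane-namespace> -o json. It only reads the standard metadata.deletionTimestamp and metadata.finalizers fields.

```python
import json


def stuck_terminating(list_json):
    """Return (kind, name, finalizers) for objects that are marked for
    deletion but still hold at least one finalizer."""
    items = json.loads(list_json).get("items", [])
    stuck = []
    for item in items:
        meta = item.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        # A set deletionTimestamp plus remaining finalizers means the object
        # is waiting for some controller to finish cleanup.
        if meta.get("deletionTimestamp") and finalizers:
            stuck.append((item.get("kind", ""), meta.get("name", ""), finalizers))
    return stuck
```

      An object reported here is not necessarily orphaned; it is only waiting on a finalizer. It becomes a problem when the controller responsible for that finalizer is no longer running, as in this bug.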

      Expected results:

      • Critical deployments should have finalizers to prevent accidental deletion
      • If a deployment is marked for deletion, its removal should block until dependent CAPI resources have been cleaned up
      • Cloud resources should be properly deleted when HostedCluster is deleted
      • No orphaned cloud infrastructure
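
      The expected behavior could be expressed as a small decision function that a reconciler evaluates for each protected deployment. This is a minimal sketch under stated assumptions, not a proposed implementation: the function name deletion_gate_action, the returned action strings, and the finalizer name hypothetical.example/capi-cleanup are invented here, and the caller is assumed to supply the list of CAPI resources that still exist.

```python
FINALIZER = "hypothetical.example/capi-cleanup"  # illustrative name, not from HyperShift


def deletion_gate_action(deployment, remaining_dependents, finalizer=FINALIZER):
    """Decide what a reconciler should do with a protected deployment.

    deployment: dict shaped like a Kubernetes object (only metadata is read).
    remaining_dependents: names of CAPI resources that still exist.
    """
    meta = deployment.get("metadata", {})
    finalizers = meta.get("finalizers") or []
    deleting = bool(meta.get("deletionTimestamp"))

    if not deleting:
        # Normal operation: make sure the protective finalizer is present.
        return "noop" if finalizer in finalizers else "add-finalizer"
    if finalizer not in finalizers:
        # Deletion is already unblocked as far as this finalizer is concerned.
        return "noop"
    if remaining_dependents:
        # Keep the deployment object around until dependents are gone.
        return "wait"
    return "remove-finalizer"
```

      The sketch covers only the gating decision. It does not model deletion propagation or how the deployment's pods behave while the object is Terminating, both of which a real fix would need to account for.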

      Additional info:

      Affected deployments (confirmed via code search):

      Critical:

      • capi-provider (manages platform machines: AWSMachine, AzureMachine, etc.)
      • cluster-api (manages MachineDeployment, MachineSet, Machine)

      Also potentially affected:

      • cloud-controller-manager-* (AWS, Azure, OpenStack, KubeVirt, PowerVS)
      • autoscaler
      • karpenter/karpenter-operator

              Unassigned
              Antoni Segura Puimedon (asegurap1@redhat.com)
              Yu Li
              Votes: 0
              Watchers: 3