Bug · Resolution: Unresolved · Normal · 4.20.z · Quality / Stability / Reliability · Moderate
Description of problem:
Control plane deployments created by the control-plane-operator (including capi-provider, cluster-api, and cloud-controller-manager-*) do not have finalizers to protect against accidental deletion. These deployments run the controllers that manage CAPI resources (MachineDeployment, Machine, and platform-specific machines), and those resources DO have finalizers that require their controllers to be running in order to process cleanup.
If a deployment like capi-provider is accidentally deleted before HostedCluster deletion:
- The deployment is deleted immediately (no finalizer protection)
- The CAPI provider controller stops running
- During HostedCluster deletion, CAPI resources are marked for deletion
- CAPI resource finalizers cannot be processed (the controller is gone)
- Cloud resources (EC2 instances, VMs, disks, NICs, load balancers) are orphaned
- CAPI resources remain stuck in the Terminating state indefinitely
This affects all CAPI-based platforms: AWS, Azure, GCP, OpenStack, KubeVirt, PowerVS, Agent.
Code references:
- No finalizers on deployments: support/controlplane-component/builder.go (NewDeploymentComponent)
- CAPI resources have finalizers: vendor/sigs.k8s.io/cluster-api/api/v1beta1/machinedeployment_types.go:30
- Platform machine finalizers: vendor/sigs.k8s.io/cluster-api-provider-aws/v2/api/v1beta2/awsmachine_types.go:27
- Deletion flow: hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go:3185
- Compare with NodePool, which correctly uses a finalizer (see the sketch below): hypershift-operator/controllers/nodepool/nodepool_controller.go:54
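For illustration only, a minimal Go sketch of how finalizer protection could be attached to the generated deployments, mirroring the NodePool pattern. The finalizer name, the AddProtectionFinalizer helper, and the package name are hypothetical and are not existing HyperShift identifiers; controllerutil is the standard controller-runtime helper package.

package example

import (
	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Hypothetical finalizer name; HyperShift does not currently define this constant.
const controlPlaneProtectionFinalizer = "hypershift.openshift.io/control-plane-protection"

// AddProtectionFinalizer attaches a protective finalizer to a control plane
// deployment so that a stray "oc delete deployment" blocks until the owning
// controller confirms cleanup is safe and removes the finalizer.
func AddProtectionFinalizer(deployment *appsv1.Deployment) {
	controllerutil.AddFinalizer(deployment, controlPlaneProtectionFinalizer)
}

The owning controller would then be responsible for removing the finalizer once dependent CAPI resources are gone (see the sketch under Expected results).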
Version-Release number of selected component (if applicable):
All HyperShift versions
How reproducible:
Always - any accidental deletion of the capi-provider or cluster-api deployment triggers it
Steps to Reproduce:
- Create a HostedCluster with a NodePool on any CAPI platform (AWS, Azure, etc.)
- Wait for the MachineDeployment and cloud instances to be created
- Delete the capi-provider deployment: oc delete deployment capi-provider -n <control-plane-namespace>
- Delete the HostedCluster: oc delete hostedcluster <name>
Actual results:
- The deployment is deleted immediately, with no finalizer protection
- CAPI resources (MachineDeployment, Machine, AWSMachine) are stuck in the Terminating state
- Cloud resources (EC2 instances, VMs, disks) are orphaned and continue running
- Manual cleanup is required via the cloud provider console
- Potential cost implications from orphaned resources
Expected results:
- Critical deployments should have finalizers to prevent accidental deletion
- If a deployment is marked for deletion, deletion should wait for dependent resources to be cleaned up (see the sketch after this list)
- Cloud resources should be properly deleted when HostedCluster is deleted
- No orphaned cloud infrastructure
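For illustration only, a minimal Go sketch of the "wait for dependents" behavior, assuming the hypothetical protection finalizer from the sketch above: once the deployment has a deletion timestamp, the owning controller lists the CAPI Machines in the control plane namespace and removes the finalizer only after all of them are gone. The releaseWhenMachinesGone name is illustrative; client, clusterv1, and controllerutil are the standard controller-runtime and cluster-api packages.

package example

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Hypothetical finalizer name; see the sketch above.
const controlPlaneProtectionFinalizer = "hypershift.openshift.io/control-plane-protection"

// releaseWhenMachinesGone returns true once the deployment may actually be
// deleted, i.e. no CAPI Machines remain in its namespace and the protective
// finalizer has been removed.
func releaseWhenMachinesGone(ctx context.Context, c client.Client, deployment *appsv1.Deployment) (bool, error) {
	var machines clusterv1.MachineList
	if err := c.List(ctx, &machines, client.InNamespace(deployment.Namespace)); err != nil {
		return false, err
	}
	if len(machines.Items) > 0 {
		// Dependent CAPI resources still exist; keep the finalizer so the
		// provider deployment stays around to process their own finalizers.
		return false, nil
	}
	if controllerutil.RemoveFinalizer(deployment, controlPlaneProtectionFinalizer) {
		return true, c.Update(ctx, deployment)
	}
	return true, nil
}

In practice the check would likely also need to cover MachineDeployments, MachineSets, and the platform-specific machine types before releasing the deployment.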
Additional info:
Affected deployments (confirmed via code search):
Critical:
- capi-provider (manages platform machines: AWSMachine, AzureMachine, etc.)
- cluster-api (manages MachineDeployment, MachineSet, Machine)
Also potentially affected:
- cloud-controller-manager-* (AWS, Azure, OpenStack, KubeVirt, PowerVS)
- autoscaler
- karpenter/karpenter-operator