Type: Epic
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:

Epic Name:
Installer-like approach to hosted cluster teardown
Blocked:
False
Blocked Reason:
None
Ready:
False
Color Status:
Not Selected
Epic Status:
To Do
Feature Link:
OCPSTRAT-1715 - Control Plane Operator Direct cloud resource cleanup
Parent Link:
OCPSTRAT-1715 - Control Plane Operator Direct cloud resource cleanup
Hierarchy Progress Bar:

100% To Do, 0% In Progress, 0% Done

Sprint:
Hypershift Sprint 253, Hypershift Sprint 254, Hypershift Sprint 255, Hypershift Sprint 256, Hypershift Sprint 257
Cost of Delay:
0
WSJF:
0
Risk Score:
0

User Story:

As a Hosted Cluster admin, I want to be able to:

Delete hosted clusters in the minimum time

so that I can achieve

Minimum cloud resource consumption

Service provider achieves

Better UX
Less computation related to deleting resources

Acceptance Criteria:

Description of criteria:

HyperShift directly manages resource deletion
Resource deletion failures alert the customer (especially important for billable items)

Out of Scope:

Cloud resource deletion throttling detection

Engineering Details:

Currently, cloud resource cleanup is delegated to operators that run in the hosted control plane: the registry operator cleans up its bucket, the ingress operator removes additional DNS entries, the cloud controller manager removes load balancers and persistent volumes, and so on. The benefit of this approach is that the CPO needs no cloud-specific code to destroy resources. The drawback is that cleanup can take a long time and depends on the hosted cluster's API server being healthy.
A different approach that could make this process faster is to destroy resources directly, in a similar way to `openshift-install destroy cluster` or even `hypershift destroy cluster infra`. Instead of waiting for controllers to do the right thing, the CPO would delete the resources itself, which would be more straightforward and likely much faster.
One consideration with this approach is that, unlike the CLI tools, the CPO doesn't have a single role that can destroy all resources. We would have to access AWS with different operator roles to destroy the different types of resources. This can be done via API calls similar to those the token-minter command makes to obtain tokens for the different service accounts.

incorporates

OCPBUGS-32289 Possible leak of ELB in HCP cluster

OCPBUGS-29698 Stopping instances results in machines stuck deleting

Closed

is duplicated by

HOSTEDCP-1454 Installer-like deprovisioning approach

Closed

is related to

RFE-5385 Clusters stuck in uninstalling should be able to get unstuck by bugfixes

Approved

links to

openshift/hypershift#3975: HOSTEDCP-1402: cmd/infra/aws/destroy: allow using component credentials

Assignee: Unassigned

Reporter: Cesar Wong

Votes: 0

Watchers: 3

Created: 2024/01/24 8:02 PM

Updated: 2024/11/07 11:16 PM
