-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
4.17
-
None
-
Quality / Stability / Reliability
-
False
Description of problem:
When destroying an agent cluster, problems deprovisioning the agents can prevent the cluster from deleting. For example, today we encountered an issue in which the DNS records for a cluster were deleted at the same time as the cluster, leading to this situation:

    $ k -n hardware-inventory get agent 2f25a998-0f1d-c202-4fdd-a2c300c9b7da -o json | jq .status.deprovision_info
    {
      "cluster_name": "london",
      "cluster_namespace": "clusters-london",
      "message": "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://api.london.int.massopen.cloud:30894/api/v1\": dial tcp: lookup api.london.int.massopen.cloud on 172.30.0.10:53: no such host",
      "node_name": "moc-r4pac24u35-s1c"
    }

This information isn't particularly discoverable without a fair amount of a priori knowledge about hosted control planes. In particular:

- There is no indication of a problem in the output of `kubectl get agents`.
- There is no indication of a problem in the .status attribute of either the HostedCluster resource or the HostedControlPlane resource.
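As a stopgap until this is surfaced properly, the check above can be applied to every agent at once. The following is only a rough sketch, not part of any product: it assumes the input is the JSON produced by `kubectl get agents -A -o json` saved to a file, and that a non-empty `message` under `.status.deprovision_info` (the field shown above) means deprovisioning is stuck. The function name `stuck_agents` and the file name `agents.json` are made up for illustration.

```python
import json
import sys


def stuck_agents(agent_list):
    """Return one summary dict per agent that reports a deprovision message.

    agent_list is the parsed output of `kubectl get agents -A -o json`.
    Assumption: a non-empty .status.deprovision_info.message signals a problem.
    """
    findings = []
    for item in agent_list.get("items", []):
        # Agents without a status or without deprovision_info are skipped.
        info = (item.get("status") or {}).get("deprovision_info") or {}
        message = info.get("message")
        if not message:
            continue
        meta = item.get("metadata") or {}
        findings.append({
            "namespace": meta.get("namespace"),
            "agent": meta.get("name"),
            "cluster": info.get("cluster_name"),
            "node": info.get("node_name"),
            "message": message,
        })
    return findings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python3 stuck_agents.py agents.json
    with open(sys.argv[1]) as handle:
        for f in stuck_agents(json.load(handle)):
            print(f"{f['namespace']}/{f['agent']} "
                  f"(cluster {f['cluster']}, node {f['node']}): {f['message']}")
```

Run against our environment, this would have listed the agent above together with the "no such host" error, without needing to know in advance which agent to inspect.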
Version-Release number of selected component (if applicable):
We're running ACM 2.12.2.
How reproducible:
Delete the DNS records for a cluster, and then attempt to delete the cluster.
Actual results:
The cluster does not finish deleting, and the only indication of the cause is the .status.deprovision_info field on the affected agent.
Expected results:
Issues preventing a cluster from deleting should be surfaced to the cluster administrator in a more obvious fashion.
Additional info: