Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.17
Component/s: HyperShift / Agent
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

When destroying an agent cluster, problems deprovisioning the agents can prevent the cluster from being deleted. For example, today we encountered an issue in which the DNS records for a cluster were deleted at the same time as the cluster. The hosted cluster's API hostname no longer resolved, and agent deprovisioning failed with the following error:

$ k -n hardware-inventory get agent 2f25a998-0f1d-c202-4fdd-a2c300c9b7da -o json | jq .status.deprovision_info
{
  "cluster_name": "london",
  "cluster_namespace": "clusters-london",
  "message": "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://api.london.int.massopen.cloud:30894/api/v1\": dial tcp: lookup api.london.int.massopen.cloud on 172.30.0.10:53: no such host",
  "node_name": "moc-r4pac24u35-s1c"
}

This information isn't particularly discoverable without a fair amount of a priori knowledge about hosted control planes. In particular:

- There is no indication of a problem in the output of `kubectl get agents`.
- There is no indication of a problem in the .status attribute of either the HostedCluster resource or the HostedControlPlane resource.
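
Until this is surfaced more visibly, affected agents can be found by filtering the agent list on the same `.status.deprovision_info` field shown above. The following is a minimal sketch rather than part of any existing tooling: the function name `agents_with_deprovision_errors` is made up for illustration, and it assumes `jq` is installed and that every affected agent carries a `message` under `deprovision_info`, as in the example above.

```shell
# Hypothetical helper: reads the JSON from `kubectl get agents -o json` on stdin
# and prints one tab-separated line per agent that has a deprovision message:
#   namespace, agent name, cluster name, message
agents_with_deprovision_errors() {
  jq -r '
    .items[]
    # Agents without deprovision_info (or without a message) are skipped.
    | select(.status.deprovision_info.message != null)
    | [ .metadata.namespace,
        .metadata.name,
        .status.deprovision_info.cluster_name,
        .status.deprovision_info.message ]
    | @tsv'
}
```

It could be run against all namespaces with `kubectl get agents --all-namespaces -o json | agents_with_deprovision_errors`. An empty result only means that no agent currently reports a deprovision message; it does not rule out other causes of a stuck deletion.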

Version-Release number of selected component (if applicable):

We're running ACM 2.12.2

How reproducible:

Delete the DNS records for a cluster, and then attempt to delete the cluster.

Actual results:

Expected results:

Issues preventing a cluster from deleting should be surfaced to the cluster administrator in a more obvious fashion.

Additional info:

Assignee: Crystal Chun

Reporter: Lars Kellogg-Stedman

QA Contact: Zheng Feng

Need Info From: None

Votes: 0

Watchers: 5

Created: 2025/02/04 11:20 PM

Updated: 2026/01/30 2:36 AM
