OpenShift Bugs / OCPBUGS-49868

Agent deprovisioning problems are not surfaced to higher-level cluster resources


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • 4.17
    • HyperShift / Agent
    • Quality / Stability / Reliability

      Description of problem:

      When destroying an agent cluster, problems deprovisioning the agents can prevent the cluster from being deleted. For example, today we encountered an issue in which the DNS records for a cluster were deleted at the same time as the cluster, leading to this situation:
      
      $ kubectl -n hardware-inventory get agent 2f25a998-0f1d-c202-4fdd-a2c300c9b7da -o json | jq .status.deprovision_info
      {
        "cluster_name": "london",
        "cluster_namespace": "clusters-london",
        "message": "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://api.london.int.massopen.cloud:30894/api/v1\": dial tcp: lookup api.london.int.massopen.cloud on 172.30.0.10:53: no such host",
        "node_name": "moc-r4pac24u35-s1c"
      }
      
      This information isn't particularly discoverable without a fair amount of a priori knowledge about hosted control planes. In particular:
      
      - There is no indication of a problem in the output of `kubectl get agents`.
      - There is no indication of a problem in the .status attribute of either the HostedCluster resource or the HostedControlPlane resource.
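
As a stopgap, the per-agent check above can be generalized to scan every agent at once. The following is a minimal sketch, not an official tool: the function name `stuck_agents` is hypothetical, and it assumes that a non-empty `.status.deprovision_info.message` (the field shown in the output above) indicates a deprovisioning problem.

```shell
#!/bin/sh
# Hypothetical helper: reads an Agent list in JSON on stdin (for example the
# output of `kubectl get agents -A -o json`) and prints one tab-separated
# line per agent: namespace, agent name, cluster name, message.
# Assumption: a non-empty .status.deprovision_info.message means that
# deprovisioning of that agent has hit a problem.
stuck_agents() {
  jq -r '
    .items[]
    | select((.status.deprovision_info.message // "") != "")
    | [ .metadata.namespace,
        .metadata.name,
        .status.deprovision_info.cluster_name,
        .status.deprovision_info.message ]
    | @tsv
  '
}
```

It would be used as `kubectl get agents -A -o json | stuck_agents`. Agents without a `deprovision_info` entry, or with an empty message, are skipped.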

      Version-Release number of selected component (if applicable):

      We're running ACM 2.12.2.

      How reproducible:

      Delete the DNS records for a cluster, and then attempt to delete the cluster.

      Actual results:

          

      Expected results:

      Issues preventing a cluster from deleting should be surfaced to the cluster administrator in a more obvious fashion.

      Additional info:

          

              cchun@redhat.com Crystal Chun
              lkellogg@redhat.com Lars Kellogg-Stedman
              Liangquan Li