  1. OpenShift Hosted Control Plane
  2. HOSTEDCP-597

Cluster deletion hangs because of awsendpointService


    • Type: Story
    • Resolution: Done
    • Priority: Critical
    • Sprint: Hypershift Sprint 18, Hypershift Sprint 19

      Reproduce

      Log in to the fleet-managed clusters: https://docs.google.com/document/d/1j2RzOfdLOFviKrboIREGkgU7Vbw8Y8jv3h_VrdjVX5A/edit#

      The conflicting cluster can be listed with kubectl get hc -n ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv

      The AWSEndpointService never goes away because its finalizer, hypershift.openshift.io/hypershift-operator-finalizer, is not removed.

      kubectl logs --tail=50 operator-865b7db4c4-xdj95 -n hypershift | grep lponce15

      {"level":"info","ts":"2022-10-17T10:24:10Z","msg":"reconciling","controller":"hostedcluster","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedCluster","hostedCluster":{"name":"lponce15","namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv"},"namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv","name":"lponce15","reconcileID":"b41e2aa0-525e-4fbb-8e00-53f296812e4c"}
      {"level":"info","ts":"2022-10-17T10:24:10Z","msg":"Waiting for awsendpointservice deletion","controller":"hostedcluster","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedCluster","hostedCluster":{"name":"lponce15","namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv"},"namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv","name":"lponce15","reconcileID":"b41e2aa0-525e-4fbb-8e00-53f296812e4c","controlPlaneNamespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv-lponce15"}
      {"level":"info","ts":"2022-10-17T10:24:10Z","msg":"hostedcluster is still deleting","controller":"hostedcluster","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedCluster","hostedCluster":{"name":"lponce15","namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv"},"namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv","name":"lponce15","reconcileID":"b41e2aa0-525e-4fbb-8e00-53f296812e4c","name":{"namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv","name":"lponce15"}}
      

      The logs suggest that the AWSEndpointServiceReconciler is failing to delete the AWS endpoint service:

      {"level":"error","ts":"2022-10-18T12:00:33Z","msg":"Reconciler error","controller":"awsendpointservice","controllerGroup":"hypershift.openshift.io","controllerKind":"AWSEndpointService","aWSEndpointService":{"name":"private-router","namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv-lponce15"},"namespace":"ocm-staging-1v8vhn06etvacjcrojejlnuap128lqhv-lponce15","name":"private-router","reconcileID":"d76b8ab6-4be9-452c-b140-957fd20a6a35","error":"failed to delete resource: Service has existing active VPC Endpoint connections!","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
      

      This deletion logic should have run and succeeded: https://github.com/openshift/hypershift/blob/main/hypershift-operator/controllers/platform/aws/controller.go#L493-L529
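
      For reference, a minimal sketch of the ordering that would avoid the error above, assuming the failure comes from deleting the endpoint service while endpoint connections are still attached. This is not the hypershift implementation. The endpointServiceAPI interface, fakeEC2 and the helper functions are hypothetical names made up for illustration; the method names only loosely mirror the EC2 actions DescribeVpcEndpointConnections, RejectVpcEndpointConnections and DeleteVpcEndpointServiceConfigurations, with simplified signatures. The fake refuses deletion while connections exist, so the run first reproduces the error and then deletes cleanly.

```go
package main

import (
	"errors"
	"fmt"
)

// Finalizer name taken from the report above.
const finalizer = "hypershift.openshift.io/hypershift-operator-finalizer"

// endpointServiceAPI is a hypothetical, simplified view of the EC2 calls
// involved in tearing down a VPC endpoint service.
type endpointServiceAPI interface {
	ListConnections(serviceID string) ([]string, error)
	RejectConnections(serviceID string, endpointIDs []string) error
	DeleteService(serviceID string) error
}

// deleteEndpointService rejects any attached endpoint connections first and
// only then deletes the service itself.
func deleteEndpointService(api endpointServiceAPI, serviceID string) error {
	ids, err := api.ListConnections(serviceID)
	if err != nil {
		return fmt.Errorf("failed to list connections: %w", err)
	}
	if len(ids) > 0 {
		if err := api.RejectConnections(serviceID, ids); err != nil {
			return fmt.Errorf("failed to reject connections: %w", err)
		}
	}
	if err := api.DeleteService(serviceID); err != nil {
		return fmt.Errorf("failed to delete resource: %w", err)
	}
	return nil
}

// removeFinalizer returns the finalizer list without the given name.
func removeFinalizer(finalizers []string, name string) []string {
	kept := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	return kept
}

// fakeEC2 is an in-memory stand-in (an assumption, not AWS behavior in
// detail) that refuses deletion while connections are attached.
type fakeEC2 struct {
	connections map[string][]string
}

func (f *fakeEC2) ListConnections(id string) ([]string, error) {
	return f.connections[id], nil
}

func (f *fakeEC2) RejectConnections(id string, endpointIDs []string) error {
	rejected := map[string]bool{}
	for _, ep := range endpointIDs {
		rejected[ep] = true
	}
	var remaining []string
	for _, ep := range f.connections[id] {
		if !rejected[ep] {
			remaining = append(remaining, ep)
		}
	}
	f.connections[id] = remaining
	return nil
}

func (f *fakeEC2) DeleteService(id string) error {
	if len(f.connections[id]) > 0 {
		return errors.New("Service has existing active VPC Endpoint connections!")
	}
	delete(f.connections, id)
	return nil
}

func main() {
	// "vpce-svc-example" and the endpoint IDs are placeholder values.
	api := &fakeEC2{connections: map[string][]string{
		"vpce-svc-example": {"vpce-1", "vpce-2"},
	}}

	// Deleting directly fails while connections are attached.
	fmt.Println("direct delete:", api.DeleteService("vpce-svc-example"))

	// Rejecting first lets the delete succeed, after which the finalizer
	// can be dropped and the Kubernetes object can go away.
	finalizers := []string{finalizer}
	if err := deleteEndpointService(api, "vpce-svc-example"); err == nil {
		finalizers = removeFinalizer(finalizers, finalizer)
	}
	fmt.Println("remaining finalizers:", len(finalizers))
}
```

      The point of the sketch is the ordering: the finalizer is only removed once the AWS-side delete returns without error, so any failure in that step leaves the AWSEndpointService, and therefore the HostedCluster deletion, waiting.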

            agarcial@redhat.com Alberto Garcia Lamela
            agarcial@redhat.com Alberto Garcia Lamela
            Jie Zhao Jie Zhao