Bug
Resolution: Unresolved
Critical
4.16
Critical
No
Rejected
False
Release Note Not Required
In Progress
Description of problem:
If the cluster-api-based installer exits before bootstrap destroy, it may leak processes that continue to run in the background on the host system. These processes may keep reconciling cloud resources, so cluster resources can be created and recreated even while you are trying to delete them. This happens because the installer runs kube-apiserver, etcd, and the CAPI provider binaries as subprocesses; if the installer exits without shutting them down, due to an error or a user interrupt, those subprocesses continue to run in the background. The leaked processes can be identified with ps; pgrep and pkill are also useful. Brief discussion of this occurring on PowerVS: https://redhat-internal.slack.com/archives/C05QFJN2BQW/p1712688922574429
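For example, a leaked local control plane can usually be found and cleaned up from the shell. The name patterns below are only illustrative; the exact process names depend on the platform and provider binaries in use:

# list candidate leaked installer subprocesses (patterns are examples only)
pgrep -fl 'kube-apiserver|etcd|cluster-api'
# terminate them once you have confirmed they were spawned by the installer
pkill -f 'cluster-api'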
Version-Release number of selected component (if applicable):
How reproducible:
Often
Steps to Reproduce:
1. Run a capi-based install (on any platform) by specifying the fields below in the install config [0].
2. Wait until the CAPI controllers begin to run. This is easy to identify because the terminal will fill with controller logs; in particular, you should see the output in [1].
3. Once the controllers are running, interrupt the installer with CTRL + C.

[0] Install config fields for a capi-based install:

featureGates:
- ClusterAPIInstall=true
featureSet: CustomNoUpgrade

[1] Controller startup output:

INFO Started local control plane with envtest
INFO Stored kubeconfig for envtest in: /c/auth/envtest.kubeconfig
INFO Running process: Cluster API with args [-v=2 --metrics-bind-addr=0 --
Actual results:
The controllers leak and continue to run; they can be viewed with ps or pgrep. You may also see:

INFO Shutting down local Cluster API control plane...

which means the shutdown started but did not complete.
Expected results:
The installer should shut down gracefully and not leak processes, for example:

^CWARNING Received interrupt signal
INFO Shutting down local Cluster API control plane...
INFO Stopped controller: Cluster API
INFO Stopped controller: aws infrastructure provider
ERROR failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to create infrastructure manifest: Post "https://127.0.0.1:41441/apis/infrastructure.cluster.x-k8s.io/v1beta2/awsclustercontrolleridentities": unexpected EOF
INFO Local Cluster API system has completed operations
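A minimal Go sketch of the expected behavior, not the installer's actual implementation: the parent catches the interrupt, cancels a context, and waits for its child process to exit before exiting itself. The "sleep 600" child and the 10-second grace period are placeholders standing in for the real etcd/kube-apiserver/provider subprocesses; cmd.Cancel and cmd.WaitDelay require Go 1.20+.

package main

import (
	"context"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	// Cancel this context when the user hits CTRL+C (SIGINT) or sends SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Placeholder child process; in the real installer this would be etcd,
	// kube-apiserver, or a CAPI provider binary.
	cmd := exec.CommandContext(ctx, "sleep", "600")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	// On context cancellation, ask the child to stop with SIGTERM first,
	// then force-kill it if it is still running after the grace period.
	cmd.Cancel = func() error { return cmd.Process.Signal(syscall.SIGTERM) }
	cmd.WaitDelay = 10 * time.Second

	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// Block until the child exits, either on its own or because the
	// interrupt cancelled the context; nothing is left running behind.
	if err := cmd.Wait(); err != nil {
		os.Exit(1)
	}
}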
Additional info: