Bug
Resolution: Unresolved
Undefined
4.11
Quality / Stability / Reliability
False
Important
Rejected
Version:
./openshift-install 4.11.0-0.nightly-2022-04-24-135651
built from commit 9cf0c5a963bf983ccf997fed46e7bcde81a02569
release image registry.ci.openshift.org/ocp/release@sha256:3cfd57e4c7cff0807b7811a3a885b336955e1f7b4c646b17975307c350830879
release architecture amd64
Platform: alibabacloud
Please specify: IPI
What happened?
Destroying a working cluster does not delete all of the cluster's resources. For example, 1 or 2 compute nodes, security groups, load balancers, the NAT gateway and EIP, and the VPC are left behind.
What did you expect to happen?
Destroying a working cluster in any region should delete all resources of the cluster.
How to reproduce it (as minimally and precisely as possible)?
It appears to reproduce every time in the regions we have tried so far ("us-east-1", "eu-west-1", "eu-central-1"); all of them show the issue. We also suspect that these leftover resources are what used up the VPC/SLB quota for the prow CI jobs.
Anything else we need to know?
We first hit the issue while debugging prow CI jobs (see PR https://github.com/openshift/release/pull/28083). We then tried QE flexy jobs and saw the same issue, although not in every region (e.g. region "ap-northeast-1" did not show it).
The QE flexy jobs and log snippets:
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/97010/ (region: us-east-1)
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-destroy/87853/
>log snippet of the destroy:
04-25 13:18:34.763 level=debug msg=OpenShift Installer 4.11.0-0.nightly-2022-04-24-135651
04-25 13:18:34.763 level=debug msg=Built from commit 9cf0c5a963bf983ccf997fed46e7bcde81a02569
04-25 13:18:34.763 level=debug msg=Retrieving cloud resources tags=
04-25 13:18:36.645 level=debug msg=Retrieving cloud resources tags={"ack.aliyun.com":"jiwei-0425-02-5g5cb"}
04-25 13:18:37.592 level=debug msg=Searching RAM policy policyName=jiwei-0425-02-5g5cb-policy-bootstrap stage=RAM roles
04-25 13:18:37.592 level=debug msg=Searching OSS bucket bucketName=jiwei-0425-02-5g5cb-image-registry-us-east-1-fsgpawgrijcevgxpq stage=OSS buckets
04-25 13:18:37.592 level=debug msg=Searching DNS records stage=DNS records
04-25 13:18:38.515 level=debug msg=Unbinding tags for OSS bucket bucketName=jiwei-0425-02-5g5cb-image-registry-us-east-1-fsgpawgrijcevgxpq stage=OSS buckets tags=[kubernetes.io/cluster/jiwei-0425-02-5g5cb]
>04-25 13:18:39.104 level=debug msg=Deleting ECS instances ecsIDs=[i-0xi0csfz5bpct1o0py3g i-0xi0csfz5bpct1o0py3h i-0xifdsmun8b9v22aojmi i-0xi0csfz5bpct9k566fw i-0xi0csfz5bpctbj6a8gl] stage=ECS instances
04-25 13:18:39.359 level=debug msg=Deleting policyName=jiwei-0425-02-5g5cb-policy-bootstrap stage=RAM roles
04-25 13:18:40.316 level=debug msg=Searching OSS bucket objects bucketName=jiwei-0425-02-5g5cb-image-registry-us-east-1-fsgpawgrijcevgxpq stage=OSS buckets
04-25 13:18:40.316 level=debug msg=Deleting OSS bucket bucketName=jiwei-0425-02-5g5cb-image-registry-us-east-1-fsgpawgrijcevgxpq stage=OSS buckets
04-25 13:18:40.874 level=debug msg=Deleting domain=alicloud-qe.devcluster.openshift.com recordID=759221396576844800 rr=*.apps.jiwei-0425-02 stage=DNS records
04-25 13:18:41.128 level=debug msg=Deleting domain=alicloud-qe.devcluster.openshift.com recordID=759219941511925760 rr=api.jiwei-0425-02 stage=DNS records
04-25 13:18:41.413 level=debug msg=Deleting roleName=jiwei-0425-02-5g5cb-role-bootstrap stage=RAM roles
04-25 13:18:41.971 level=debug msg=Searching RAM policy policyName=jiwei-0425-02-5g5cb-policy-master stage=RAM roles
04-25 13:18:42.227 level=debug msg=Detaching policy for RAM role policyName=jiwei-0425-02-5g5cb-policy-master principalName=jiwei-0425-02-5g5cb-role-master@role.5724326381648897.onaliyunservice.com stage=RAM roles
04-25 13:18:43.184 level=debug msg=Public DNS records deleted stage=DNS records
04-25 13:18:44.125 level=debug msg=Policy detached policyName=jiwei-0425-02-5g5cb-policy-master stage=RAM roles
04-25 13:18:44.126 level=debug msg=Deleting policyName=jiwei-0425-02-5g5cb-policy-master stage=RAM roles
04-25 13:18:46.076 level=info msg=OSS bucket deleted bucketName=jiwei-0425-02-5g5cb-image-registry-us-east-1-fsgpawgrijcevgxpq stage=OSS buckets
04-25 13:18:46.076 level=info msg=OSS buckets deleted stage=OSS buckets
04-25 13:18:46.076 level=info msg=ECS instances deleted stage=ECS instances
04-25 13:18:46.076 level=debug msg=Deleting roleName=jiwei-0425-02-5g5cb-role-master stage=RAM roles
04-25 13:18:46.332 level=debug msg=Searching RAM policy policyName=jiwei-0425-02-5g5cb-policy-worker stage=RAM roles
04-25 13:18:46.587 level=debug msg=Detaching policy for RAM role policyName=jiwei-0425-02-5g5cb-policy-worker principalName=jiwei-0425-02-5g5cb-role-worker@role.5724326381648897.onaliyunservice.com stage=RAM roles
04-25 13:18:48.504 level=debug msg=Policy detached policyName=jiwei-0425-02-5g5cb-policy-worker stage=RAM roles
04-25 13:18:48.504 level=debug msg=Deleting policyName=jiwei-0425-02-5g5cb-policy-worker stage=RAM roles
04-25 13:18:50.457 level=debug msg=Deleting roleName=jiwei-0425-02-5g5cb-role-worker stage=RAM roles
04-25 13:18:50.711 level=info msg=RAM roles deleted stage=RAM roles
04-25 13:18:50.711 level=debug msg=Searching private zone clusterDomain=jiwei-0425-02.alicloud-qe.devcluster.openshift.com stage=private zones
04-25 13:18:50.965 level=debug msg=Unbinding private zone with vpc stage=private zones zoneID=123f858b8ae4176c562c03846cccae3b
04-25 13:18:52.891 level=debug msg=Deleting private zone stage=private zones zoneID=123f858b8ae4176c562c03846cccae3b
04-25 13:18:55.427 level=info msg=Private zones deleted stage=private zones
04-25 13:18:55.427 level=debug msg=Searching resource groups name=jiwei-0425-02-5g5cb-rg stage=resource groups
04-25 13:18:55.681 level=debug msg=Purging asset "Metadata" from disk
04-25 13:18:55.681 level=debug msg=Purging asset "Master Ignition Customization Check" from disk
04-25 13:18:55.681 level=debug msg=Purging asset "Worker Ignition Customization Check" from disk
04-25 13:18:55.681 level=debug msg=Purging asset "Terraform Variables" from disk
04-25 13:18:55.936 level=debug msg=Purging asset "Kubeconfig Admin Client" from disk
04-25 13:18:55.936 level=debug msg=Purging asset "Kubeadmin Password" from disk
04-25 13:18:55.936 level=debug msg=Purging asset "Certificate (journal-gatewayd)" from disk
04-25 13:18:55.936 level=debug msg=Purging asset "Cluster" from disk
04-25 13:18:55.936 level=info msg=Time elapsed: 21s
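To make the log above easier to read, here is a minimal Python sketch that pulls out the ECS instance IDs the destroy targeted and the stages that logged a completion line. The helper names (`targeted_ecs_ids`, `completed_stages`) are hypothetical and the regular expressions only assume the line shapes visible in this snippet; this is not installer code.

```python
import re

# Lines copied from the destroy log above (Flexy-destroy/87853);
# the leading ">" on the "Deleting ECS instances" line is dropped.
DESTROY_LOG = """\
04-25 13:18:39.104 level=debug msg=Deleting ECS instances ecsIDs=[i-0xi0csfz5bpct1o0py3g i-0xi0csfz5bpct1o0py3h i-0xifdsmun8b9v22aojmi i-0xi0csfz5bpct9k566fw i-0xi0csfz5bpctbj6a8gl] stage=ECS instances
04-25 13:18:46.076 level=info msg=OSS buckets deleted stage=OSS buckets
04-25 13:18:46.076 level=info msg=ECS instances deleted stage=ECS instances
04-25 13:18:50.711 level=info msg=RAM roles deleted stage=RAM roles
04-25 13:18:55.427 level=info msg=Private zones deleted stage=private zones
"""


def targeted_ecs_ids(log_text):
    """Collect every instance ID listed in an ecsIDs=[...] field."""
    ids = set()
    for match in re.finditer(r"ecsIDs=\[([^\]]*)\]", log_text):
        ids.update(match.group(1).split())
    return ids


def completed_stages(log_text):
    """Return stage names from info-level '<something> deleted stage=<name>' lines."""
    pattern = r"level=info msg=.* deleted stage=(.+)$"
    return sorted(set(re.findall(pattern, log_text, flags=re.MULTILINE)))


if __name__ == "__main__":
    print(len(targeted_ecs_ids(DESTROY_LOG)), "ECS instances targeted")
    print("stages with a completion line:", completed_stages(DESTROY_LOG))
```

In this snippet, completion lines appear only for the ECS instances, OSS buckets, RAM roles, and private zones stages. The sketch cannot tell whether other stages ran without logging or did not run at all.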
>remaining resources after destroying the cluster:
$ aliyun resourcemanager ListResources --endpoint "resourcemanager.aliyuncs.com" --Region "us-east-1" --ResourceGroupId "rg-aek2aognijpinoy" --output cols=CreateDate,RegionId,ResourceType,Service,ResourceId rows=Resources.Resource[]
CreateDate | RegionId | ResourceType | Service | ResourceId
---------- | -------- | ------------ | ------- | ----------
2022-04-25T12:15:49+08:00 | us-east-1 | disk | ecs | d-0xide7x8bi2pz2vblvt9
2022-04-25T12:15:49+08:00 | us-east-1 | eni | ecs | eni-0xifdsmun8b9v9y9gfcn
2022-04-25T12:15:49+08:00 | us-east-1 | instance | ecs | i-0xi13jzw8m86er6hirui
2022-04-25T12:00:39+08:00 | us-east-1 | securitygroup | ecs | sg-0xi0csfz5bpct1nzpsh9
2022-04-25T12:00:39+08:00 | us-east-1 | securitygroup | ecs | sg-0xi76zpfhbtwaewix3qz
2022-04-25T12:00:33+08:00 | us-east-1 | eip | eip | eip-0xinty4pdfss7cb6qf2tf
2022-04-25T12:00:39+08:00 | us-east-1 | loadbalancer | slb | lb-7go55ddlrdycbuz9ha3gv
2022-04-25T12:01:01+08:00 | us-east-1 | loadbalancer | slb | lb-7gockfv41r0erza1ugg2j
2022-04-25T12:00:57+08:00 | us-east-1 | natgateway | vpc | ngw-0xi3tiswq7y3r9vc9rdg0
2022-04-25T12:00:32+08:00 | us-east-1 | vpc | vpc | vpc-0xi83ys8ywf6igy4gucwo
$
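The listing above can be cross-checked against the destroy log mechanically. The following Python sketch parses the `cols` table into rows, counts leftovers per resource type, and reports instances that were never in the `ecsIDs=[...]` list of the destroy log. The function names are hypothetical, and the parser assumes only the pipe-separated layout shown in this report.

```python
from collections import Counter

# Table copied from the ListResources output above (resource group rg-aek2aognijpinoy).
TABLE = """\
CreateDate | RegionId | ResourceType | Service | ResourceId
---------- | -------- | ------------ | ------- | ----------
2022-04-25T12:15:49+08:00 | us-east-1 | disk | ecs | d-0xide7x8bi2pz2vblvt9
2022-04-25T12:15:49+08:00 | us-east-1 | eni | ecs | eni-0xifdsmun8b9v9y9gfcn
2022-04-25T12:15:49+08:00 | us-east-1 | instance | ecs | i-0xi13jzw8m86er6hirui
2022-04-25T12:00:39+08:00 | us-east-1 | securitygroup | ecs | sg-0xi0csfz5bpct1nzpsh9
2022-04-25T12:00:39+08:00 | us-east-1 | securitygroup | ecs | sg-0xi76zpfhbtwaewix3qz
2022-04-25T12:00:33+08:00 | us-east-1 | eip | eip | eip-0xinty4pdfss7cb6qf2tf
2022-04-25T12:00:39+08:00 | us-east-1 | loadbalancer | slb | lb-7go55ddlrdycbuz9ha3gv
2022-04-25T12:01:01+08:00 | us-east-1 | loadbalancer | slb | lb-7gockfv41r0erza1ugg2j
2022-04-25T12:00:57+08:00 | us-east-1 | natgateway | vpc | ngw-0xi3tiswq7y3r9vc9rdg0
2022-04-25T12:00:32+08:00 | us-east-1 | vpc | vpc | vpc-0xi83ys8ywf6igy4gucwo
"""

# Instance IDs from the "Deleting ECS instances" line of the matching destroy log.
TARGETED = {
    "i-0xi0csfz5bpct1o0py3g", "i-0xi0csfz5bpct1o0py3h", "i-0xifdsmun8b9v22aojmi",
    "i-0xi0csfz5bpct9k566fw", "i-0xi0csfz5bpctbj6a8gl",
}


def parse_cols_table(text):
    """Turn the pipe-separated table into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header and the dashed separator line
        cells = [cell.strip() for cell in ln.split("|")]
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def untargeted_instances(rows, targeted):
    """Leftover ECS instances that the destroy log never listed for deletion."""
    return [r["ResourceId"] for r in rows
            if r["ResourceType"] == "instance" and r["ResourceId"] not in targeted]


if __name__ == "__main__":
    rows = parse_cols_table(TABLE)
    print(Counter(r["ResourceType"] for r in rows))
    print("never targeted:", untargeted_instances(rows, TARGETED))
```

For this job, the one leftover instance is not among the five IDs the destroy log listed, which suggests it was never found by the destroy rather than failing to delete. That reading is an inference from the two snippets, not a confirmed root cause.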
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/97011/ (region: us-east-1)
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-destroy/87854/
>log snippet of the destroy:
04-25 13:18:48.314 level=debug msg=Deleting ECS instances ecsIDs=[i-0xi2s0tw2fofwinizdgo i-0xi2s0tw2fofwinizdgn i-0xide7x8bi2pywyej7he i-0xide7x8bi2pz2vhvdrg i-0xide7x8bi2pz2vhvdth] stage=ECS instances
>remaining resources after destroying the cluster:
$ aliyun resourcemanager ListResources --endpoint "resourcemanager.aliyuncs.com" --Region "us-east-1" --ResourceGroupId "rg-aek2wky7lxk4f5y" --output cols=CreateDate,RegionId,ResourceType,Service,ResourceId rows=Resources.Resource[]
CreateDate | RegionId | ResourceType | Service | ResourceId
---------- | -------- | ------------ | ------- | ----------
2022-04-25T12:16:46+08:00 | us-east-1 | disk | ecs | d-0xide7x8bi2pz2vblvv7
2022-04-25T12:16:46+08:00 | us-east-1 | eni | ecs | eni-0xifdsmun8b9v9y9gfda
2022-04-25T12:16:46+08:00 | us-east-1 | instance | ecs | i-0xifdsmun8b9v9yf4ry9
2022-04-25T12:01:50+08:00 | us-east-1 | securitygroup | ecs | sg-0xifdsmun8b9v414vl63
2022-04-25T12:01:50+08:00 | us-east-1 | securitygroup | ecs | sg-0xide7x8bi2pywye08t4
2022-04-25T12:01:44+08:00 | us-east-1 | eip | eip | eip-0xi4fcumv1uy8dxrqnfe5
2022-04-25T12:01:47+08:00 | us-east-1 | loadbalancer | slb | lb-7go6weruo4xbtp4bs0s80
2022-04-25T12:02:13+08:00 | us-east-1 | loadbalancer | slb | lb-7godzg98qanoq811ananq
2022-04-25T12:02:09+08:00 | us-east-1 | natgateway | vpc | ngw-0xir87ma3xnp9p0cqfd77
2022-04-25T12:01:44+08:00 | us-east-1 | vpc | vpc | vpc-0xi29vd1j95kwnuh9eo3x
$
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/97043/ (region: eu-west-1)
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-destroy/87861/
>log snippet of the destroy:
04-25 14:13:55.912 level=debug msg=Deleting ECS instances ecsIDs=[i-d7ofus77g1ud0rk5geio i-d7oh1c1ktewql2jvjao1 i-d7ob4zfxlfuhw351arfh i-d7oh1c1ktewql8gyvgls i-d7ob4zfxlfuhw924mxdb] stage=ECS instances
>remaining resources after destroying the cluster:
$ aliyun resourcemanager ListResources --endpoint "resourcemanager.aliyuncs.com" --Region "eu-west-1" --ResourceGroupId "rg-aek2wky7lxk4f5y" --PageSize 30 --output cols=CreateDate,RegionId,ResourceType,Service,ResourceId rows=Resources.Resource[]
CreateDate | RegionId | ResourceType | Service | ResourceId
---------- | -------- | ------------ | ------- | ----------
2022-04-25T13:50:11+08:00 | eu-west-1 | disk | ecs | d-d7oh1c1ktewql8gyq9ee
2022-04-25T13:50:11+08:00 | eu-west-1 | eni | ecs | eni-d7ob4zfxlfuhw9200l3m
2022-04-25T13:50:11+08:00 | eu-west-1 | instance | ecs | i-d7ofus77g1ud0xh8skgb
2022-04-25T13:37:40+08:00 | eu-west-1 | securitygroup | ecs | sg-d7oh1c1ktewql2jqmq48
2022-04-25T13:37:40+08:00 | eu-west-1 | securitygroup | ecs | sg-d7oh1c1ktewql2jqmq49
2022-04-25T13:37:35+08:00 | eu-west-1 | vpc | vpc | vpc-d7ow0w4vozgqccdo110jz
$
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/97042/ (region: eu-central-1)
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-destroy/87863/
>log snippet of the destroy:
04-25 14:22:26.165 level=debug msg=Deleting ECS instances ecsIDs=[i-gw8848gi5x3s3eiz2c58 i-gw8fvvej3dibrwfcx8r1 i-gw8fvvej3dibrwfcx8r2 i-gw8848gi5x3s3mf3ik22] stage=ECS instances
>remaining resources after destroying the cluster:
$ aliyun resourcemanager ListResources --endpoint "resourcemanager.aliyuncs.com" --Region "eu-central-1" --ResourceGroupId "rg-aek2aognijpinoy" --PageSize 30 --output cols=CreateDate,RegionId,ResourceType,Service,ResourceId rows=Resources.Resource[]
CreateDate | RegionId | ResourceType | Service | ResourceId
---------- | -------- | ------------ | ------- | ----------
2022-04-25T13:54:43+08:00 | eu-central-1 | disk | ecs | d-gw8glsylvylkx7kg1n87
2022-04-25T13:55:53+08:00 | eu-central-1 | disk | ecs | d-gw8ed92ta5rd4igmv3o8
2022-04-25T13:54:43+08:00 | eu-central-1 | eni | ecs | eni-gw8ed92ta5rd4igjrpsf
2022-04-25T13:55:53+08:00 | eu-central-1 | eni | ecs | eni-gw8848gi5x3s3mf3iffv
2022-04-25T13:54:43+08:00 | eu-central-1 | instance | ecs | i-gw8ed92ta5rd4igmbmrt
2022-04-25T13:55:53+08:00 | eu-central-1 | instance | ecs | i-gw8glsylvylkx7khc6cy
2022-04-25T13:37:30+08:00 | eu-central-1 | securitygroup | ecs | sg-gw8glsylvylkwzoa55ma
2022-04-25T13:37:27+08:00 | eu-central-1 | eip | eip | eip-gw8su7cs7bleui26vwmga
2022-04-25T13:37:26+08:00 | eu-central-1 | vpc | vpc | vpc-gw8nyge09ok2apd5avsmj
$
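As a rough summary of the four listings above, this Python sketch separates the resource types left behind in every run from those left behind only in some runs. The dictionary is transcribed by hand from the tables in this report, and the helper name `split_leftovers` is hypothetical.

```python
# Resource types remaining after each destroy, transcribed from the tables above.
LEFTOVER_TYPES = {
    "Flexy-destroy/87853 (us-east-1)": {
        "disk", "eni", "instance", "securitygroup",
        "eip", "loadbalancer", "natgateway", "vpc",
    },
    "Flexy-destroy/87854 (us-east-1)": {
        "disk", "eni", "instance", "securitygroup",
        "eip", "loadbalancer", "natgateway", "vpc",
    },
    "Flexy-destroy/87861 (eu-west-1)": {
        "disk", "eni", "instance", "securitygroup", "vpc",
    },
    "Flexy-destroy/87863 (eu-central-1)": {
        "disk", "eni", "instance", "securitygroup", "eip", "vpc",
    },
}


def split_leftovers(runs):
    """Return (types left in every run, types left in only some runs), both sorted."""
    sets = list(runs.values())
    always = set.intersection(*sets)
    sometimes = set.union(*sets) - always
    return sorted(always), sorted(sometimes)


if __name__ == "__main__":
    always, sometimes = split_leftovers(LEFTOVER_TYPES)
    print("left behind in every run:", always)
    print("left behind in some runs:", sometimes)
```

With the data above, an instance with its disk and ENI, at least one security group, and the VPC remain in all four runs, while the EIP, load balancers, and NAT gateway remain only in some of them.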