-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.12
-
Moderate
-
None
-
False
-
Description of problem:
The IBM Storage driver fails to run on nodes that do not have the required labels.
https://github.com/openshift/ibm-vpc-block-csi-driver
The IBM VPC node label updater is responsible for adding one of these labels, and that label is missing on the affected node.
https://github.com/openshift/ibm-vpc-node-label-updater
As a result, the storage service does not function, and cluster creation fails.
Version-Release number of selected component (if applicable):
4.12
How reproducible:
Infrequent (exact rate unknown at this time)
Steps to Reproduce:
1. Create an IPI cluster on IBM Cloud
Actual results:
Failed cluster creation, waiting for storage operator to report healthy
level=error msg=Cluster operator storage Available is False with IBMVPCBlockCSIDriverOperatorCR_IBMBlockDriverControllerServiceController_Deploying: IBMVPCBlockCSIDriverOperatorCRAvailable: IBMBlockDriverControllerServiceControllerAvailable: Waiting for Deployment
Expected results:
Successful cluster creation
Additional info:
I have seen this occur twice recently, but only have details for one failure, which happened during CI testing. Those details can be found in the Prow build:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_installer/6056/pull-ci-openshift-installer-master-e2e-ibmcloud/1562148413049409536
Primarily, the controller pod (vpc-block-driver container) was failing with:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_installer/6056/pull-ci-openshift-installer-master-e2e-ibmcloud/1562148413049409536/artifacts/e2e-ibmcloud/gather-extra/artifacts/pods/openshift-cluster-csi-drivers_ibm-vpc-block-csi-controller-6dc5f55d87-47vxh_iks-vpc-block-driver.log
{"level":"fatal","timestamp":"2022-08-23T21:34:07.969Z","caller":"cmd/main.go:110","msg":"Failed to initialize driver...","name":"ibm-vpc-block-csi-driver","CSIDriverName":"IBM VPC block driver","error":"Controller_Helper: Failed to initialize node metadata: error: One or few required node label(s) is/are missing [ibm-cloud.kubernetes.io/worker-id, failure-domain.beta.kubernetes.io/region, failure-domain.beta.kubernetes.io/zone]. Node Labels Found = [#map[beta.kubernetes.io/arch:amd64 beta.kubernetes.io/instance-type:bx2-4x16 beta.kubernetes.io/os:linux failure-domain.beta.kubernetes.io/region:eu-gb failure-domain.beta.kubernetes.io/zone:eu-gb-3 kubernetes.io/arch:amd64 kubernetes.io/hostname:ci-op-d2gzpmty-74899-zg8t6-worker-3-ktlcd kubernetes.io/os:linux node-role.kubernetes.io/worker: node.kubernetes.io/instance-type:bx2-4x16 node.openshift.io/os_id:rhcos topology.kubernetes.io/region:eu-gb topology.kubernetes.io/zone:eu-gb-3]]"}
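To make it easier to see which of the required labels is actually absent, here is a minimal sketch that diffs the two lists from the fatal message above. The label names and the "Node Labels Found" map are copied from that log line; the helper names (`parse_go_map`, `missing_labels`) are hypothetical and are not part of the driver.

```python
# Labels the driver reports as required, copied from the fatal message.
REQUIRED = [
    "ibm-cloud.kubernetes.io/worker-id",
    "failure-domain.beta.kubernetes.io/region",
    "failure-domain.beta.kubernetes.io/zone",
]

# "Node Labels Found" map from the same message, in Go's printed map form
# (space-separated key:value pairs; a value may be empty).
FOUND_RAW = (
    "beta.kubernetes.io/arch:amd64 beta.kubernetes.io/instance-type:bx2-4x16 "
    "beta.kubernetes.io/os:linux failure-domain.beta.kubernetes.io/region:eu-gb "
    "failure-domain.beta.kubernetes.io/zone:eu-gb-3 kubernetes.io/arch:amd64 "
    "kubernetes.io/hostname:ci-op-d2gzpmty-74899-zg8t6-worker-3-ktlcd "
    "kubernetes.io/os:linux node-role.kubernetes.io/worker: "
    "node.kubernetes.io/instance-type:bx2-4x16 node.openshift.io/os_id:rhcos "
    "topology.kubernetes.io/region:eu-gb topology.kubernetes.io/zone:eu-gb-3"
)


def parse_go_map(raw):
    """Split 'key:value key2:value2' into a dict (hypothetical helper)."""
    labels = {}
    for pair in raw.split():
        # Split on the first colon only; label keys here contain no colon.
        key, _, value = pair.partition(":")
        labels[key] = value
    return labels


def missing_labels(required, found):
    """Return the required label names that are not present on the node."""
    return [name for name in required if name not in found]


found = parse_go_map(FOUND_RAW)
print(missing_labels(REQUIRED, found))
```

For this log line, the region and zone labels are present, so the only name left in the result is the worker-id label.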
Of the required labels, only "ibm-cloud.kubernetes.io/worker-id" was missing. That label is added by the ibm-vpc-node-label-updater:
https://github.com/openshift/ibm-vpc-node-label-updater/blob/64c1820764f8a7065b03b08a70673b8c125876c1/pkg/nodeupdater/node_label.go#L49
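The label updater was in turn failing because it could not obtain credentials: its IAM token request timed out on the DNS lookup for iam.cloud.ibm.com.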
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_installer/6056/pull-ci-openshift-installer-master-e2e-ibmcloud/1562148413049409536/artifacts/e2e-ibmcloud/gather-extra/artifacts/pods/openshift-cluster-csi-drivers_ibm-vpc-block-csi-node-cg5bw_vpc-node-label-updater.log
{"level":"error","timestamp":"2022-08-23T21:34:29.096Z","caller":"nodeupdater/utils.go:96","msg":"Failed to Get IAM access token","watcher-name":"vpc-node-label-updater","error":"Post \"https://iam.cloud.ibm.com/oidc/token\": dial tcp: lookup iam.cloud.ibm.com: i/o timeout"}
{"level":"fatal","timestamp":"2022-08-23T21:34:29.096Z","caller":"cmd/main.go:140","msg":"Failed to read secret configuration from storage secret present in the cluster ","watcher-name":"vpc-node-label-updater","error":"Post \"https://iam.cloud.ibm.com/oidc/token\": dial tcp: lookup iam.cloud.ibm.com: i/o timeout"}
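When triaging similar failures, it may help to check whether the label updater's errors are DNS timeouts rather than rejected credentials. A rough sketch, assuming the entries are available as one JSON object per line; the `is_dns_timeout` heuristic is my own string match on the Go error text, not something the updater exposes.

```python
import json

# The two label updater entries from the log above, one JSON object per line.
# Raw strings keep the escaped quotes inside the "error" field intact.
LOG_LINES = [
    r'{"level":"error","timestamp":"2022-08-23T21:34:29.096Z","caller":"nodeupdater/utils.go:96","msg":"Failed to Get IAM access token","watcher-name":"vpc-node-label-updater","error":"Post \"https://iam.cloud.ibm.com/oidc/token\": dial tcp: lookup iam.cloud.ibm.com: i/o timeout"}',
    r'{"level":"fatal","timestamp":"2022-08-23T21:34:29.096Z","caller":"cmd/main.go:140","msg":"Failed to read secret configuration from storage secret present in the cluster ","watcher-name":"vpc-node-label-updater","error":"Post \"https://iam.cloud.ibm.com/oidc/token\": dial tcp: lookup iam.cloud.ibm.com: i/o timeout"}',
]


def is_dns_timeout(entry):
    """Heuristic (assumption): a 'lookup <host>: i/o timeout' error means DNS."""
    err = entry.get("error", "")
    return "lookup " in err and err.endswith("i/o timeout")


entries = [json.loads(line) for line in LOG_LINES]
levels = [entry["level"] for entry in entries]
for entry in entries:
    print(entry["level"], entry["caller"], is_dns_timeout(entry))
```

Both entries carry the same underlying error, so the fatal "Failed to read secret configuration" message appears to be a consequence of the token request timing out.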
The storage secret (storage-secret-store) appears to have been created multiple times by the storage operator, which could be the issue:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_installer/6056/pull-ci-openshift-installer-master-e2e-ibmcloud/1562148413049409536/artifacts/e2e-ibmcloud/gather-extra/artifacts/pods/openshift-cluster-csi-drivers_ibm-vpc-block-csi-driver-operator-97ccb4f8c-9jsqs_ibm-vpc-block-csi-driver-operator.log
I0823 20:41:24.863653 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:41:25.711470 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:41:26.799574 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:41:27.722871 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:42:39.697423 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:42:58.094318 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:42:58.921656 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:42:59.793962 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:43:45.413511 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:43:46.469002 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:51:39.789506 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:53:44.446652 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:53:45.449387 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:03:44.473388 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:03:45.360558 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:13:44.428195 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:13:45.388330 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:23:44.369963 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:23:45.321491 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:33:44.301093 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 21:33:45.276650 1 secretsync.go:125] storage-secret-store secret created successfully
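To quantify how often the secret is being recreated, the gaps between the log entries above can be computed from their timestamps. This is a minimal sketch using four of those lines as sample input. klog headers do not include a year, so 2022 is assumed here based on the timestamps in the other logs, and the helper names are hypothetical.

```python
from datetime import datetime

# A subset of the operator log lines above, used as sample input.
SAMPLE = """\
I0823 20:41:24.863653 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:41:25.711470 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:42:39.697423 1 secretsync.go:125] storage-secret-store secret created successfully
I0823 20:53:44.446652 1 secretsync.go:125] storage-secret-store secret created successfully
"""


def parse_klog_time(line, year=2022):
    """Parse the klog header 'I0823 20:41:24.863653' (year is an assumption)."""
    date_part, time_part = line.split()[:2]
    # date_part is a severity letter followed by MMDD, so drop the letter.
    stamp = f"{year}{date_part[1:]} {time_part}"
    return datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f")


def creation_gaps(log_text):
    """Seconds between consecutive 'secret created successfully' entries."""
    times = [
        parse_klog_time(line)
        for line in log_text.splitlines()
        if "secret created successfully" in line
    ]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


gaps = creation_gaps(SAMPLE)
print([round(gap, 1) for gap in gaps])
```

Running the same function over the full operator log would show whether the creations cluster around a periodic resync or happen in rapid bursts.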
- relates to
-
CORS-1513 IBM Public Cloud: Support IPI deployment
- Closed