-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
4.17.0
-
None
-
Important
-
None
-
Proposed
-
False
-
Description of problem:
When running a disconnected, private GCP cluster installation with CAPI (example failed job: https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.17-multi-nightly-gcp-ipi-disc-priv-capi-amd-mixarch-f28-destructive/1818856018684153856), the installation fails with the following error:
level=info msg=Cluster operator ingress Progressing is True with Reconciling: Not all ingress controllers are available.
level=error msg=Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: googleapi: Error 400: Resource 'projects/XXXXXXXXXXXX/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/XXXXXXXXXXXX/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/XXXXXXXXXXXX/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork
level=error msg=The cloud-controller-manager logs may contain more details.)
The cloud-controller-manager pod log (https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.17-multi-nightly-gcp-ipi-disc-priv-capi-amd-mixarch-f28-destructive/1818856018684153856/artifacts/gcp-ipi-disc-priv-capi-amd-mixarch-f28-destructive/gather-extra/artifacts/pods/openshift-cloud-controller-manager_gcp-cloud-controller-manager-9fc7bffdc-f5w5w_cloud-controller-manager.log) contains:
I0801 05:55:37.949118 1 gce_loadbalancer_internal.go:612] ensureInternalInstanceGroup(k8s-ig--544be3f7b5733dc5, us-central1-a): adding nodes: [ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9]
E0801 05:55:38.165808 1 gce_loadbalancer.go:206] Failed to EnsureLoadBalancer(ci-op-2033xdkq-530e8-qmljx, openshift-ingress, router-default, aab35500360fd4a6ab2c840364bb35d8, us-central1), err: googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork
E0801 05:55:38.165878 1 controller.go:298] error processing service openshift-ingress/router-default (retrying with exponential backoff): failed to ensure load balancer: googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork
I0801 05:55:38.165986 1 event.go:389] "Event occurred" object="openshift-ingress/router-default" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork"
This is the same error as in the installer output: the worker instance is in the worker subnet, but it is expected to be in the master subnet.
If the Prow job link above is not accessible, see https://drive.google.com/drive/folders/1ftukRDUR6hBYPvwpwBJRYNZDxoUIVZWR?usp=drive_link for the must-gather logs collected from another job that failed in the same way: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/300573/
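For triage across several failed jobs, the subnet mismatch can be pulled out of the log text mechanically. The following is only a minimal sketch, assuming the error keeps the exact wording quoted above; the function name parse_wrong_subnetwork and the regular expression are illustrative and not part of any OpenShift or GCP tooling.

```python
import re

# Matches the "wrongSubnetwork" error wording quoted in this report.
# Assumption: the message format is stable across jobs.
WRONG_SUBNET_RE = re.compile(
    r"Resource '(?P<resource>[^']+)' is expected to be in the subnetwork "
    r"'(?P<expected>[^']+)' but is in the subnetwork '(?P<actual>[^']+)'"
)


def parse_wrong_subnetwork(line):
    """Return the short names of the instance, expected subnet and actual
    subnet found in a log line, or None if the line does not match."""
    match = WRONG_SUBNET_RE.search(line)
    if match is None:
        return None
    # Keep only the last path segment of each GCP resource URL.
    return {key: value.rsplit("/", 1)[-1] for key, value in match.groupdict().items()}


# Error text copied from the cloud-controller-manager log above.
sample = (
    "googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/"
    "instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in "
    "the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/"
    "ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork "
    "'projects/openshift-qe/regions/us-central1/subnetworks/"
    "ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork"
)

print(parse_wrong_subnetwork(sample))
```

Running this against the installer output or the cloud-controller-manager log shows at a glance which instance is affected and which two subnets are involved.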
Version-Release number of selected component (if applicable):
4.17.0-0.nightly-multi-2024-07-31-212714
How reproducible:
Always with a CAPI-based install. The same cluster configuration installs successfully with the Terraform-based install.
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info: