Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: 4.17.0
Component/s: Cloud Compute / Cloud Controller Manager
Labels:
None

Severity:
Important
Regression:
None
Release Blocker:
Proposed
Blocked:
False
Blocked Reason:

None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

A disconnected, private GCP cluster installation using CAPI (example failed job: https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.17-multi-nightly-gcp-ipi-disc-priv-capi-amd-mixarch-f28-destructive/1818856018684153856) failed with the following error:

level=info msg=Cluster operator ingress Progressing is True with Reconciling: Not all ingress controllers are available.
level=error msg=Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: googleapi: Error 400: Resource 'projects/XXXXXXXXXXXX/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/XXXXXXXXXXXX/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/XXXXXXXXXXXX/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork
level=error msg=The cloud-controller-manager logs may contain more details.)


The cloud-controller-manager pod log is available at:
https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.17-multi-nightly-gcp-ipi-disc-priv-capi-amd-mixarch-f28-destructive/1818856018684153856/artifacts/gcp-ipi-disc-priv-capi-amd-mixarch-f28-destructive/gather-extra/artifacts/pods/openshift-cloud-controller-manager_gcp-cloud-controller-manager-9fc7bffdc-f5w5w_cloud-controller-manager.log

I0801 05:55:37.949118       1 gce_loadbalancer_internal.go:612] ensureInternalInstanceGroup(k8s-ig--544be3f7b5733dc5, us-central1-a): adding nodes: [ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9]
E0801 05:55:38.165808       1 gce_loadbalancer.go:206] Failed to EnsureLoadBalancer(ci-op-2033xdkq-530e8-qmljx, openshift-ingress, router-default, aab35500360fd4a6ab2c840364bb35d8, us-central1), err: googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork
E0801 05:55:38.165878       1 controller.go:298] error processing service openshift-ingress/router-default (retrying with exponential backoff): failed to ensure load balancer: googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork
I0801 05:55:38.165986       1 event.go:389] "Event occurred" object="openshift-ingress/router-default" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' but is in the subnetwork 'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., wrongSubnetwork"

The log shows the same error: the worker instance is in the worker subnet, but it is expected to be in the master subnet.
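When triaging several failed jobs, it can help to pull the instance and the two subnet names out of the wrongSubnetwork message automatically. The following is a minimal sketch, not part of the installer or the cloud-controller-manager; the regular expression and the helper name parse_wrong_subnetwork are assumptions based only on the message format quoted in the log above.

```python
import re

# Assumed pattern, derived from the "wrongSubnetwork" message quoted in this report.
WRONG_SUBNET_RE = re.compile(
    r"Resource '(?P<instance>[^']+)' is expected to be in the subnetwork "
    r"'(?P<expected>[^']+)' but is in the subnetwork '(?P<actual>[^']+)'"
)


def parse_wrong_subnetwork(line):
    """Return the short names of the instance, expected subnet, and actual
    subnet found in a log line, or None if the line does not match."""
    match = WRONG_SUBNET_RE.search(line)
    if match is None:
        return None
    # Keep only the last path segment of each GCP resource path.
    return {key: value.rsplit("/", 1)[-1] for key, value in match.groupdict().items()}


# Error text copied from the cloud-controller-manager log in this report.
sample = (
    "googleapi: Error 400: Resource 'projects/openshift-qe/zones/us-central1-a/instances/"
    "ci-op-2033xdkq-530e8-qmljx-worker-a-qfcz9' is expected to be in the subnetwork "
    "'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-master-subnet' "
    "but is in the subnetwork "
    "'projects/openshift-qe/regions/us-central1/subnetworks/ci-op-2033xdkq-530e8-worker-subnet'., "
    "wrongSubnetwork"
)

result = parse_wrong_subnetwork(sample)
print(result)
```

For the sample line, the "expected" field holds the master subnet and the "actual" field holds the worker subnet, which is the mismatch described in this report.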

If the Prow job link above is not accessible, see https://drive.google.com/drive/folders/1ftukRDUR6hBYPvwpwBJRYNZDxoUIVZWR?usp=drive_link for the must-gather logs collected from another job that failed the same way: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/300573/

Version-Release number of selected component (if applicable):

 4.17.0-0.nightly-multi-2024-07-31-212714

How reproducible:

   Always with a CAPI install. The same cluster configuration installs successfully with Terraform.

Steps to Reproduce:

    1.
    2.
    3.

Actual results:

Expected results:

Additional info:

Assignee: Joel Speed

Reporter: Gaoyun Pei

QA Contact: Zhaohua Sun

Votes: 0

Watchers: 2

Created: 2024/08/01 7:26 AM

Updated: 2024/08/02 1:15 AM

Resolved: 2024/08/02 1:15 AM
