- Bug
- Resolution: Done
- Normal
- None
- 4.14
- No
- False
Description of problem:
On ci/prow/e2e-gcp-ovn-techpreview jobs, we noticed that the installation would always fail. This also happens when launching jobs from ClusterBot with https://github.com/openshift/cloud-provider-gcp/pull/35 and the TechPreview feature flag set. A sample job is https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cloud-provider-gcp/35/pull-ci-openshift-cloud-provider-gcp-master-e2e-gcp-ovn-techpreview/1700229284032942080

In the above job, the machine configs change from "rendered-worker-d5ca314d6412a630a0df72eb28b88543" to "rendered-worker-3eb45ca2b54008d2d1d5a6701a84bd7e", but no nodes are ever given "rendered-worker-3eb45ca2b54008d2d1d5a6701a84bd7e" as a desired configuration. When diff'ing the two configs, the only notable change is that `/etc/mco/internal-registry-pull-secret.json` changes from being empty to being populated.

Within the Machine Config Operator controller logs of this job, we see the new machine config being generated, but not assigned.

```
☸ ocp/api-ci-l2s4-p1-openshiftapps-com:6443/nbrubake (ocp) in Downloads/artifacts/pods
❯ rg rendered-worker-3 openshift-machine-config-operator_machine-config-controller-7c754bffdd-84k5s_machine-config-controller.log
184:I0908 19:59:48.796617 1 render_controller.go:510] Generated machineconfig rendered-worker-3eb45ca2b54008d2d1d5a6701a84bd7e from 7 configs: [{MachineConfig 00-worker machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-container-runtime machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-kubelet machineconfiguration.openshift.io/v1 } {MachineConfig 97-worker-generated-kubelet machineconfiguration.openshift.io/v1 } {MachineConfig 98-worker-generated-kubelet machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-registries machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-ssh machineconfiguration.openshift.io/v1 }]
185:I0908 19:59:48.797076 1 event.go:298] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"openshift-machine-config-operator", Name:"worker", UID:"3416d468-710b-4c19-a956-00dead3dec84", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"25805", FieldPath:""}): type: 'Normal' reason: 'RenderedConfigGenerated' rendered-worker-3eb45ca2b54008d2d1d5a6701a84bd7e successfully generated (release version: 4.15.0-0.ci.test-2023-09-08-193239-ci-op-h57nt20x-latest, controller version: 5b821a279c88fee1cc1886a6cf1ec774891a2258)
187:I0908 19:59:48.872593 1 render_controller.go:536] Pool worker: now targeting: rendered-worker-3eb45ca2b54008d2d1d5a6701a84bd7e
```

We have _not_ seen this when deploying onto GCP manually, however. A similar ClusterBot failure is here: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp-modern/1702401781788577792
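For anyone triaging a similar run, the "generated but not assigned" symptom can be checked against a gathered node list. The following is only a minimal sketch: it assumes a `nodes.json`-style dump in the shape of `oc get nodes -o json` and relies on the MCO node annotations `machineconfiguration.openshift.io/desiredConfig` and `machineconfiguration.openshift.io/currentConfig`. The node names and the `stale_nodes` helper are hypothetical and not taken from the job artifacts.

```python
# Sketch: list worker nodes whose desiredConfig is not the pool's target.
# Assumes the input has the shape of `oc get nodes -o json`.
DESIRED = "machineconfiguration.openshift.io/desiredConfig"
CURRENT = "machineconfiguration.openshift.io/currentConfig"
WORKER_LABEL = "node-role.kubernetes.io/worker"

OLD = "rendered-worker-d5ca314d6412a630a0df72eb28b88543"
NEW = "rendered-worker-3eb45ca2b54008d2d1d5a6701a84bd7e"


def stale_nodes(node_list, target):
    """Return (name, desired, current) for workers not targeting `target`."""
    stale = []
    for node in node_list.get("items", []):
        meta = node.get("metadata", {})
        if WORKER_LABEL not in meta.get("labels", {}):
            continue  # only the worker pool is of interest here
        annotations = meta.get("annotations", {})
        desired = annotations.get(DESIRED)
        if desired != target:
            stale.append((meta.get("name"), desired, annotations.get(CURRENT)))
    return stale


# Hypothetical data mirroring the failure: the worker still points at OLD.
sample = {
    "items": [
        {"metadata": {"name": "worker-a",
                      "labels": {WORKER_LABEL: ""},
                      "annotations": {DESIRED: OLD, CURRENT: OLD}}},
        {"metadata": {"name": "master-0",
                      "labels": {"node-role.kubernetes.io/master": ""},
                      "annotations": {}}},
    ]
}

for name, desired, current in stale_nodes(sample, NEW):
    print(f"{name}: desired={desired} current={current}")
```

In a run showing this bug, every worker would be expected to appear in the output, since the pool targets the new rendered config while the nodes keep the old one as their desired configuration.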
Version-Release number of selected component (if applicable):
How reproducible:
So far, 100% of the time with prow-based deployments
Steps to Reproduce:
1. Launch ci/prow/e2e-gcp-ovn-techpreview or create a GCP cluster with TechPreviewNoUpgrade from ClusterBot
2. Wait for the launch to fail
Actual results:
Worker nodes are restarted, which causes services such as OLM, ingress, the image registry, and others to become unavailable.
Expected results:
The image pull secret doesn't change during an install, or if it does, it doesn't result in stuck workers.
Additional info:
No Machines or MachineSets are populated in the gather-extra artifacts, but that's likely because this is meant to be a ClusterAPI-managed cluster, which uses a different API group than the one the gather scripts query. Also, there are worker nodes in the gather-extra artifacts, but they are marked as unschedulable due to missing network routes.
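To confirm which nodes are affected by the missing routes, the same node dump can be filtered by taint. This sketch assumes the nodes carry the `node.kubernetes.io/network-unavailable` taint named in the related issue; the node names and the `network_unavailable_nodes` helper are hypothetical.

```python
# Sketch: find nodes tainted as network-unavailable in an
# `oc get nodes -o json`-shaped document.
NETWORK_UNAVAILABLE = "node.kubernetes.io/network-unavailable"


def network_unavailable_nodes(node_list):
    """Return names of nodes that carry the network-unavailable taint."""
    names = []
    for node in node_list.get("items", []):
        # `taints` may be absent or null on untainted nodes.
        taints = node.get("spec", {}).get("taints") or []
        if any(t.get("key") == NETWORK_UNAVAILABLE for t in taints):
            names.append(node["metadata"]["name"])
    return names


# Hypothetical data: one tainted worker, one healthy worker.
sample_nodes = {
    "items": [
        {"metadata": {"name": "worker-a"},
         "spec": {"taints": [{"key": NETWORK_UNAVAILABLE,
                              "effect": "NoSchedule"}]}},
        {"metadata": {"name": "worker-b"}, "spec": {}},
    ]
}

print(network_unavailable_nodes(sample_nodes))
```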
- blocks
- OCPBUGS-5755 GCP XPN private cluster install attempts to add masters to k8s-ig-xxxx instance groups
- Closed
- relates to
- OCPBUGS-18572 [gcp] installation with "featureSet: TechPreviewNoUpgrade" failed, possibly due to nodes getting taint - "node.kubernetes.io/network-unavailable"
- Closed