-
Bug
-
Resolution: Duplicate
-
Normal
-
None
-
4.13
-
None
-
No
-
False
-
Description of problem:
A cluster created with a single-arch payload can be migrated to a multi-arch payload using the `oc adm upgrade --to-multi-arch` command. If, while the migration is in progress, we provision a machineset whose architecture differs from the control plane architecture (with the appropriate arch-specific bootimage), the machine stays in the Provisioned state forever and no node is created. The machine does get created and boots up, but when MCO pivots to the machine-os-content, it pivots to the single-arch machine-os because the upgrade is not complete yet. While this case is rare, it would be helpful if the error were propagated out of the machine so that the machine transitions to a Failed state.
Version-Release number of selected component (if applicable):
4.13.0-ec.4
How reproducible:
always
Steps to Reproduce:
1. Create an AWS amd64 cluster with a single arch 4.13.0-ec.4 payload
2. Execute the migration command `oc adm upgrade --to-multi-arch`
3. Provision a machineset with the arm64 bootimage and instance type
4. Monitor with `oc get machines -n openshift-machine-api`
Actual results:
The machine stays in the Provisioned state indefinitely.
Expected results:
The machine transitions to the Failed state.
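To make the expected behaviour concrete, here is a minimal sketch of a timeout-based check that would move a machine to Failed when it has been Provisioned for too long without a node. It is not the machine-api controller's actual logic: the `MachineView` type, the `next_phase` function, and the 30-minute `NODE_JOIN_TIMEOUT` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold; not a value taken from the machine-api code.
NODE_JOIN_TIMEOUT = timedelta(minutes=30)


@dataclass
class MachineView:
    """Hypothetical, simplified view of a Machine object."""
    name: str
    phase: str                  # e.g. "Provisioned", "Running", "Failed"
    provisioned_at: datetime    # when the instance was reported as provisioned
    node_name: Optional[str]    # None while no node has registered


def next_phase(machine: MachineView, now: datetime) -> str:
    """Return "Failed" if a Provisioned machine has had no node for too long.

    In every other case the current phase is returned unchanged.
    """
    waiting_for_node = machine.phase == "Provisioned" and machine.node_name is None
    if waiting_for_node and now - machine.provisioned_at > NODE_JOIN_TIMEOUT:
        return "Failed"
    return machine.phase


# Example: an arm64 machine that booted but never registered a node.
stuck = MachineView(
    name="worker-us-east-1c-example",  # hypothetical machine name
    phase="Provisioned",
    provisioned_at=datetime(2023, 3, 14, 23, 0, tzinfo=timezone.utc),
    node_name=None,
)
print(next_phase(stuck, datetime(2023, 3, 14, 23, 48, tzinfo=timezone.utc)))
```

A plain timeout cannot tell an architecture mismatch apart from a slow boot, so a check like this would only report that the machine failed to join, not why.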
Additional info:
Logs from machine-api-controller:
I0314 23:48:15.684256 1 reconciler.go:407] psundara-mycluster01-8mrjn-worker-us-east-1c-m5nlr: ProviderID already set in the machine Spec with value:aws:///us-east-1c/i-0c0d4fb0fe65f651d
I0314 23:48:15.684315 1 reconciler.go:267] Updated machine psundara-mycluster01-8mrjn-worker-us-east-1c-m5nlr
I0314 23:48:15.684323 1 machine_scope.go:167] psundara-mycluster01-8mrjn-worker-us-east-1c-m5nlr: Updating status
I0314 23:48:15.761009 1 machine_scope.go:193] psundara-mycluster01-8mrjn-worker-us-east-1c-m5nlr: finished calculating AWS status
I0314 23:48:15.761027 1 machine_scope.go:90] psundara-mycluster01-8mrjn-worker-us-east-1c-m5nlr: patching machine
I0314 23:48:15.778043 1 controller.go:341] psundara-mycluster01-8mrjn-worker-us-east-1c-m5nlr: has no node yet, requeuing
- relates to
-
OTA-961 Prepare for Cluster version status to report transitions from single arch to multi arch correctly
- Dev Complete