-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.14.0
-
Moderate
-
No
-
MCO Sprint 247, MCO Sprint 248, MCO Sprint 249, MCO Sprint 250, MCO Sprint 251
-
5
-
False
-
Bug Fix
-
Done
Description of problem:
When an MCP has the on-cluster-build functionality enabled and the on-cluster-build configmap is first configured with a valid imageBuilderType, later updating that configmap with an invalid imageBuilderType does not degrade the machine-config ClusterOperator.
Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-0.nightly-2023-09-12-195514   True        False         3h56m   Cluster version is 4.14.0-0.nightly-2023-09-12-195514
How reproducible:
Always
Steps to Reproduce:
1. Create a valid OCB configmap and 2 valid secrets. For example:

apiVersion: v1
data:
  baseImagePullSecretName: mco-global-pull-secret
  finalImagePullspec: quay.io/mcoqe/layering
  finalImagePushSecretName: mco-test-push-secret
  imageBuilderType: ""
kind: ConfigMap
metadata:
  creationTimestamp: "2023-09-13T15:10:37Z"
  name: on-cluster-build-config
  namespace: openshift-machine-config-operator
  resourceVersion: "131053"
  uid: 1e0c66de-7a9a-4787-ab98-ce987a846f66

3. Label the "worker" MCP to enable the OCB functionality in it.

$ oc label mcp/worker machineconfiguration.openshift.io/layering-enabled=

4. Wait for the machine-os-builder pod to be created and for the build to finish. Only wait for the pods; do not wait for the MCPs to be updated. As soon as the build pod has finished the build, go to step 5.

5. Patch the on-cluster-build configmap to use an invalid imageBuilderType.

$ oc patch cm/on-cluster-build-config -n openshift-machine-config-operator -p '{"data":{"imageBuilderType": "fake"}}'
Actual results:
The machine-os-builder pod crashes:

$ oc get pods
NAME                                         READY   STATUS             RESTARTS         AGE
machine-config-controller-5bdd7b66c5-6l7sz   2/2     Running            2 (45m ago)      63m
machine-config-daemon-5ttqh                  2/2     Running            0                63m
machine-config-daemon-l95rj                  2/2     Running            0                63m
machine-config-daemon-swtc6                  2/2     Running            2                57m
machine-config-daemon-vq594                  2/2     Running            2                57m
machine-config-daemon-zrf4f                  2/2     Running            0                63m
machine-config-operator-7dd564556d-9smk4     2/2     Running            2 (45m ago)      65m
machine-config-server-9sxjv                  1/1     Running            0                62m
machine-config-server-m5sdl                  1/1     Running            0                62m
machine-config-server-zb2hr                  1/1     Running            0                62m
machine-os-builder-6cfbd8d5d-t6g8w           0/1     CrashLoopBackOff   6 (3m11s ago)    9m16s

But the machine-config ClusterOperator is not degraded:

$ oc get co machine-config
NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
machine-config   4.14.0-0.nightly-2023-09-12-195514   True        False         False      63m
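The mismatch above (builder pod in CrashLoopBackOff while DEGRADED stays False) could be checked automatically from the two command outputs. The following is only a rough sketch: the helper names `column_by_name` and `undetected_builder_failure` are hypothetical, the sample text is an abbreviated copy of the output in this report, and the whitespace-based parsing assumes the requested column and everything to its left contain no spaces.

```python
def column_by_name(table_text, column):
    """Map each row's first field (NAME) to its value in `column`.

    Assumes whitespace-separated `oc get` output where the target column
    and all columns to its left contain no embedded spaces.
    """
    lines = [ln for ln in table_text.strip().splitlines() if ln.strip()]
    idx = lines[0].split().index(column)
    result = {}
    for ln in lines[1:]:
        fields = ln.split()
        result[fields[0]] = fields[idx]
    return result


def undetected_builder_failure(pods_text, co_text):
    """Return True when a machine-os-builder pod is crash-looping
    but the machine-config ClusterOperator is not reported as degraded."""
    statuses = column_by_name(pods_text, "STATUS")
    builder_crashing = any(
        name.startswith("machine-os-builder-") and status == "CrashLoopBackOff"
        for name, status in statuses.items()
    )
    degraded = column_by_name(co_text, "DEGRADED").get("machine-config") == "True"
    return builder_crashing and not degraded


# Abbreviated sample taken from the outputs in this report.
PODS = """
NAME READY STATUS RESTARTS AGE
machine-config-controller-5bdd7b66c5-6l7sz 2/2 Running 2 (45m ago) 63m
machine-config-daemon-5ttqh 2/2 Running 0 63m
machine-os-builder-6cfbd8d5d-t6g8w 0/1 CrashLoopBackOff 6 (3m11s ago) 9m16s
"""

CO = """
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
machine-config 4.14.0-0.nightly-2023-09-12-195514 True False False 63m
"""

if __name__ == "__main__":
    print(undetected_builder_failure(PODS, CO))
```

In a real check, the two strings would come from the captured output of `oc get pods` and `oc get co machine-config` rather than from inline samples.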
Expected results:
The machine-config ClusterOperator should become degraded when an invalid imageBuilderType is configured.
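The expected behavior can be sketched as a check that is re-run on every version of the configmap, not only on the first one. This is an illustration, not the operator's actual code: the function names are hypothetical, and the set of accepted builder type names is an assumption that does not come from this report. The report itself only establishes that "" is accepted and "fake" is not.

```python
# ASSUMPTION: illustrative set of accepted builder types; not taken from this report.
ASSUMED_VALID_TYPES = frozenset({"openshift-image-builder", "custom-pod-builder"})


def expected_degraded(configmap_data, valid_types=ASSUMED_VALID_TYPES):
    """Return True if this configmap data should degrade the ClusterOperator."""
    builder_type = configmap_data.get("imageBuilderType", "")
    if builder_type == "":
        # The report uses an empty value in its valid configmap (step 1).
        return False
    return builder_type not in valid_types


def evaluate_updates(configmap_versions, valid_types=ASSUMED_VALID_TYPES):
    """Evaluate every successive configmap version, so an invalid value
    introduced by a later patch is caught, not only one set at creation."""
    return [expected_degraded(data, valid_types) for data in configmap_versions]


if __name__ == "__main__":
    history = [
        {"imageBuilderType": ""},      # step 1: valid configmap
        {"imageBuilderType": "fake"},  # step 5: patched with an invalid type
    ]
    print(evaluate_updates(history))
```

The point of `evaluate_updates` is that validation is applied to each update in turn, which matches the expectation that the patch in step 5 should flip the operator to degraded.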
Additional info:
If we configure an invalid imageBuilderType directly (that is, not by patching or editing an existing configmap), the machine-config CO is degraded. When we edit the configmap instead, it is not.

A link to the must-gather file is provided in the first comment of this issue.

PS: If we wait for the MCPs to be updated in step 4, the machine-os-builder pod is not restarted with the new "fake" imageBuilderType, but the machine-config CO is not degraded either, and it should be. Does that make sense?
- links to
-
RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update