-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.19.0
-
Quality / Stability / Reliability
-
False
-
-
3
-
Important
-
No
-
None
-
Proposed
-
MCO Sprint 266, MCO Sprint 267, MCO Sprint 268, MCO Sprint 269
-
4
-
In Progress
-
Release Note Not Required
Description of problem:
In some scenarios, when a MOSB is interrupted and the rebuild annotation is added to the MOSC to rebuild the interrupted MOSB, the image is never rebuilt.
Version-Release number of selected component (if applicable):
4.19
How reproducible:
Always
Steps to Reproduce:
The steps below use the new API; the behavior should be the same with the old API.

1. Create an infra pool

   apiVersion: machineconfiguration.openshift.io/v1
   kind: MachineConfigPool
   metadata:
     name: infra
   spec:
     machineConfigSelector:
       matchExpressions:
         - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
     nodeSelector:
       matchLabels:
         node-role.kubernetes.io/infra: ""

2. Create a MOSC for the infra pool

   oc create -f - << EOF
   apiVersion: machineconfiguration.openshift.io/v1
   kind: MachineOSConfig
   metadata:
     name: mosc-infra
   spec:
     machineConfigPool:
       name: infra
     currentImagePullSecret:
       name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
     imageBuilder:
       imageBuilderType: Job
     baseImagePullSecret:
       name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
     renderedImagePushSecret:
       name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
     renderedImagePushSpec: "quay.io/mcoqe/layering:ocl"
   EOF

3. Wait for the image to be created

4. Delete the MOSC resource created in step 2

5. Delete the infra pool

6. Create a MOSC for the worker pool

   oc create -f - << EOF
   apiVersion: machineconfiguration.openshift.io/v1
   kind: MachineOSConfig
   metadata:
     name: mosc-worker
   spec:
     machineConfigPool:
       name: worker
     currentImagePullSecret:
       name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
     imageBuilder:
       imageBuilderType: Job
     baseImagePullSecret:
       name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
     renderedImagePushSecret:
       name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
     renderedImagePushSpec: "quay.io/mcoqe/layering:ocl"
   EOF

7. Wait until the new Job is created, then delete it to interrupt the MOSB resource

   $ oc get machineosbuild
   NAME                                           PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
   mosc-infra-b9e2aca7838fb9be42ee2755c9ff35fc    False      False      True        False         False    8m
   mosc-worker-2b5aaa0c9933f34e763039e753b7aefa   False      True       False       False         False    37s

8. Add the rebuild annotation to the MOSC to rebuild the interrupted MOSB

   $ oc patch machineosconfig mosc-worker --type json -p '[{"op": "add", "path": "/metadata/annotations/machineconfiguration.openshift.io~1rebuild", "value":""}]'
   machineosconfig.machineconfiguration.openshift.io/mosc-worker patched
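Note on the patch path in step 8: the `~1` in `machineconfiguration.openshift.io~1rebuild` is the JSON Pointer (RFC 6901) escape for the `/` in the annotation key. The sketch below shows how that path could be generated for an arbitrary key. The helper names `escape_json_pointer` and `rebuild_patch` are hypothetical and only for illustration; they are not part of `oc` or the MCO.

```shell
#!/bin/sh
# Escape a key for use as a single JSON Pointer segment (RFC 6901):
# '~' must become '~0' first, then '/' becomes '~1'.
escape_json_pointer() {
  printf '%s' "$1" | sed -e 's/~/~0/g' -e 's#/#~1#g'
}

# Build a JSON Patch body that adds an empty-valued annotation,
# in the same shape as the patch used in step 8.
rebuild_patch() {
  printf '[{"op": "add", "path": "/metadata/annotations/%s", "value": ""}]' \
    "$(escape_json_pointer "$1")"
}

# Print the patch body for the rebuild annotation key.
rebuild_patch 'machineconfiguration.openshift.io/rebuild'
echo
```

Under these assumptions, the printed body could be passed to the same command as in step 8, for example `oc patch machineosconfig mosc-worker --type json -p "$(rebuild_patch 'machineconfiguration.openshift.io/rebuild')"`.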
Actual results:
A new job is triggered, and then it is immediately terminated:

    $ l job
    NAME                                                 STATUS        COMPLETIONS   DURATION   AGE
    build-mosc-worker-2b5aaa0c9933f34e763039e753b7aefa   Terminating   0/1           21s        21s

The MOSB resource is recreated, but it is immediately reported as Interrupted again:

    $ oc get machineosbuild
    NAME                                           PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
    mosc-infra-b9e2aca7838fb9be42ee2755c9ff35fc    False      False      True        False         False    47m
    mosc-worker-2b5aaa0c9933f34e763039e753b7aefa   False      False      False       True          False    119s
Expected results:
The MOSB resource should be rebuilt without problems
Additional info:
- relates to: OCPBUGS-48675 In OCL. Error rebuilding a failed build after fixing the failure root cause (Closed)
- links to: RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update