-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.20
-
Quality / Stability / Reliability
-
False
-
-
1
-
Moderate
-
MCO Sprint 271
-
1
-
In Progress
-
Release Note Not Required
This is a clone of issue OCPBUGS-53390. The following is the description of the original issue:
---
Description of problem:
When a MachineOSBuild (MOSB) fails to build the OCL image and we delete the build Job to interrupt it, the MOSB never reports an "Interrupted=true" status.
Version-Release number of selected component (if applicable):
Using IPI on AWS
4.19.0-0.nightly-2025-03-20-062111
How reproducible:
Always
Steps to Reproduce:
1. Create a MachineOSConfig (MOSC) resource to enable OCL, using an invalid Containerfile to force a build failure
oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: worker-mosc
spec:
  machineConfigPool:
    name: worker
  imageBuilder:
    imageBuilderType: Job
  baseImagePullSecret:
    name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
  renderedImagePushSecret:
    name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
  renderedImagePushSpec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
  containerFile:
  - content: |-
      RUN wrong-command-touch /etc/test.test
EOF
2. Wait for the builder pods to start failing
$ oc get machineosbuild
NAME                                           PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b   False      True       False       False         False    3m10s
$ oc get pods
NAME                                                             READY   STATUS    RESTARTS       AGE
build-worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b-24slm         0/2     Error     0              3m10s
build-worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b-2p4xv         2/2     Running   0              14s
kube-rbac-proxy-crio-ip-10-0-11-170.us-east-2.compute.internal   1/1     Running   3 (129m ago)   129m
3. Remove the Job to interrupt the MOSB
$ oc get job
NAME                                                 STATUS    COMPLETIONS   DURATION   AGE
build-worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b   Running   0/1           3m59s      3m59s
$ oc delete job build-worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b
job.batch "build-worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b" deleted
Actual results:
The MOSB resource remains stuck in "Building" status after the Job is deleted
$ oc get machineosbuild
NAME                                           PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
worker-mosc-5c23b42f7ccfeedb511145c2eb487c2b   False      True       False       False         False    4m39s
Expected results:
When the Job is removed, the MOSB resource should report an "Interrupted=true" status
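A possible way to check the expected result without reading the table by eye is sketched below. The helper name mosb_column is hypothetical (it is not part of oc); it only parses the table printed by "oc get machineosbuild" and prints the requested column for each row.

```shell
# Hypothetical helper (not part of oc): read `oc get machineosbuild` table
# output on stdin and print "<name> <value>" for the requested column.
mosb_column() {
  awk -v col="$1" '
    # Header row: remember the position of the requested column.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
    # Data rows: print the resource name and the value in that column.
    idx     { print $1, $idx }
  '
}

# Usage against a live cluster (assumes `oc` is logged in):
#   oc get machineosbuild | mosb_column INTERRUPTED
```

With the fix in place, the value printed for the MOSB should change from False to True once the Job has been deleted. Assuming the INTERRUPTED column is backed by a status condition of type "Interrupted", the same value could also be read with oc get machineosbuild <name> -o jsonpath='{.status.conditions[?(@.type=="Interrupted")].status}'.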
Additional info:
- clones
-
OCPBUGS-53390 In OCL. Failing MOSBs cannot be interrupted
-
- Closed
-
- is blocked by
-
OCPBUGS-53390 In OCL. Failing MOSBs cannot be interrupted
-
- Closed
-
- links to
-
RHEA-2024:11038
OpenShift Container Platform 4.19.z bug fix update