-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.18
-
Moderate
-
None
-
False
-
Description of problem:
Since https://issues.redhat.com/browse/OCPBUGS-36810, all configmaps should be garbage collected when the MOSC resource is deleted while a build is running, and creating a new MOSC afterwards should result in a successful build. However, when the MOSC resource is removed just before the build-rendered pod finishes, the build-rendered pod (in Terminating status) is still able to create the "digest-rendered" configmap, and that configmap is not garbage collected. If we then create a new MOSC, its build fails because the new "digest-rendered" configmap collides with the leaked one.
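A quick diagnostic for the leak described above (a hedged sketch; it assumes the leaked configmap keeps the "digest-rendered-" name prefix shown in the actual results):

$ # should return nothing once garbage collection has finished; any remaining entry is the leaked configmap
$ oc get cm -n openshift-machine-config-operator -o name | grep digest-rendered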
Version-Release number of selected component (if applicable):
4.18.0-0.nightly-2024-09-26-093014
How reproducible:
Always
Steps to Reproduce:
1. Create a custom MCP.
2. Apply any MOSC.
3. Run this script to delete the MOSC just when the build pod is about to finish pushing the image (a watch command to observe the race is sketched after the script):

$ cat reproduce_script.sh
while true
do
    oc logs -l machineconfiguration.openshift.io/rendered-machine-config -n openshift-machine-config-operator | grep "buildah push"
    if [ "$?" == "0" ]
    then
        echo "Render build pod started to push the image"
        sleep 5
        echo "Remove the mosc"
        oc delete machineosconfig --all
    fi
    sleep 1
done
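While the script runs, the race can be observed from a second terminal; a minimal sketch, assuming the default openshift-machine-config-operator namespace and the resource name prefixes shown in the actual results below:

$ # poll the build pod status and the build configmaps every 2 seconds
$ while true; do oc get pods,cm -n openshift-machine-config-operator | grep -E 'rendered|digest'; sleep 2; done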
Actual results:
The MOSB is garbage collected, and so are all existing configmaps. Nevertheless, the build pod stays in Terminating status for a while, and it has enough time to create the "digest-rendered" configmap after the garbage collection has finished. As a result, the "digest-rendered" configmap is leaked, and if we create a new MOSC its build fails because the new "digest-rendered" configmap collides with the one that already exists. The result is this one:

$ oc get pods,cm
preserve-sregidor-work: Thu Sep 26 14:59:07 2024
NAME                                                              READY   STATUS    RESTARTS        AGE
pod/kube-rbac-proxy-crio-ip-10-0-10-89.us-east-2.compute.internal    1/1   Running   4               4h50m
pod/kube-rbac-proxy-crio-ip-10-0-41-218.us-east-2.compute.internal   1/1   Running   4               4h49m
pod/kube-rbac-proxy-crio-ip-10-0-56-75.us-east-2.compute.internal    1/1   Running   3               4h44m
pod/kube-rbac-proxy-crio-ip-10-0-8-214.us-east-2.compute.internal    1/1   Running   4               3h51m
pod/kube-rbac-proxy-crio-ip-10-0-83-194.us-east-2.compute.internal   1/1   Running   4               4h50m
pod/machine-config-controller-68df57cdf-ljrk8                        2/2   Running   0               98m
pod/machine-config-daemon-5gdj2                                      2/2   Running   2 (6m53s ago)   12m
pod/machine-config-daemon-cd7hg                                      2/2   Running   1 (9m56s ago)   12m
pod/machine-config-daemon-r97kw                                      2/2   Running   0               12m
pod/machine-config-daemon-wlbsq                                      2/2   Running   1 (9m47s ago)   12m
pod/machine-config-daemon-xpzf6                                      2/2   Running   4 (49s ago)     12m
pod/machine-config-operator-59746c6f74-wsfhq                         2/2   Running   0               91m
pod/machine-config-server-8db8k                                      1/1   Running   1               4h47m
pod/machine-config-server-fkf4r                                      1/1   Running   1               4h47m
pod/machine-config-server-vlqfv                                      1/1   Running   1               4h47m

NAME                                                                DATA   AGE
configmap/coreos-bootimages                                         4      4h50m
configmap/digest-rendered-infra-89d689d461245f3557da375e30cc5c90    1      12m    <----- LEAKED
configmap/kube-rbac-proxy                                           1      4h50m
configmap/kube-root-ca.crt                                          1      4h51m
configmap/kubeconfig-data                                           1      4h47m
configmap/machine-config-operator-images                            1      4h50m
configmap/machine-config-osimageurl                                 4      4h50m
configmap/openshift-service-ca.crt                                  1      4h51m
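Until the leak is fixed, a possible workaround (a sketch based on the output above; substitute the configmap name reported in your own cluster) is to delete the leaked configmap manually before creating a new MOSC:

$ oc delete configmap digest-rendered-infra-89d689d461245f3557da375e30cc5c90 -n openshift-machine-config-operator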
Expected results:
After we remove the MOSC resource, all configmaps should be garbage collected, and if we create a new MOSC later, the new build should not fail.
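A hedged sketch of how this expected behavior could be verified, assuming the same MOSC manifest from step 2 of the reproduction is re-applied (the manifest file name is a placeholder):

$ oc delete machineosconfig --all
$ # once the MOSB, the build pod and every digest-rendered-* configmap are gone, re-create the MOSC
$ oc apply -f <mosc-manifest-from-step-2>.yaml
$ # the new MachineOSBuild should complete successfully instead of failing on a configmap name collision
$ oc get machineosbuild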
Additional info:
- is related to
-
OCPBUGS-36810 When MOSC is deleted in building state, configmap resources related to MOSC are not deleted.
- Verified