-
Bug
-
Resolution: Done
-
Normal
-
None
-
4.18
-
Quality / Stability / Reliability
-
False
-
-
2
-
Moderate
Description of problem:
Since https://issues.redhat.com/browse/OCPBUGS-36810, deleting the MOSC resource while a build is running should garbage collect all related configmaps, so that a new MOSC can be created afterwards and its build succeeds.
However, when the MOSC resource is removed just before the build-rendered pod finishes, the pod (still in Terminating status) is able to create the "digest-rendered" configmap, and this configmap is not garbage collected. If a new MOSC is created afterwards, its build fails because the new "digest-rendered" configmap collides with the leaked one.
Version-Release number of selected component (if applicable):
4.18.0-0.nightly-2024-09-26-093014
How reproducible:
Always
Steps to Reproduce:
1. Create CustomMCP
2. Apply any MOSC
3. Run this script to delete the MOSC when the build pod is about to finish
$ cat reproduce_script.sh
#!/bin/bash
# Poll the build pod logs; once the image push starts, wait briefly and delete every MOSC.
while true
do
    if oc logs -l machineconfiguration.openshift.io/rendered-machine-config -n openshift-machine-config-operator | grep "buildah push"
    then
        echo "Render build pod started to push the image"
        sleep 5
        echo "Remove the mosc"
        oc delete machineosconfig --all
    fi
    sleep 1
done
Actual results:
The MOSB and all existing configmaps are garbage collected. However, the build pod remains in Terminating status for a while, long enough to create the "digest-rendered" configmap after garbage collection has finished. As a result, the "digest-rendered" configmap is leaked, and if a new MOSC is created, its builds fail because the new "digest-rendered" configmap collides with the existing one.
The resulting cluster state:
$ oc get pods,cm        # preserve-sregidor-work: Thu Sep 26 14:59:07 2024
NAME                                                                READY   STATUS    RESTARTS        AGE
pod/kube-rbac-proxy-crio-ip-10-0-10-89.us-east-2.compute.internal   1/1     Running   4               4h50m
pod/kube-rbac-proxy-crio-ip-10-0-41-218.us-east-2.compute.internal  1/1     Running   4               4h49m
pod/kube-rbac-proxy-crio-ip-10-0-56-75.us-east-2.compute.internal   1/1     Running   3               4h44m
pod/kube-rbac-proxy-crio-ip-10-0-8-214.us-east-2.compute.internal   1/1     Running   4               3h51m
pod/kube-rbac-proxy-crio-ip-10-0-83-194.us-east-2.compute.internal  1/1     Running   4               4h50m
pod/machine-config-controller-68df57cdf-ljrk8                       2/2     Running   0               98m
pod/machine-config-daemon-5gdj2                                     2/2     Running   2 (6m53s ago)   12m
pod/machine-config-daemon-cd7hg                                     2/2     Running   1 (9m56s ago)   12m
pod/machine-config-daemon-r97kw                                     2/2     Running   0               12m
pod/machine-config-daemon-wlbsq                                     2/2     Running   1 (9m47s ago)   12m
pod/machine-config-daemon-xpzf6                                     2/2     Running   4 (49s ago)     12m
pod/machine-config-operator-59746c6f74-wsfhq                        2/2     Running   0               91m
pod/machine-config-server-8db8k                                     1/1     Running   1               4h47m
pod/machine-config-server-fkf4r                                     1/1     Running   1               4h47m
pod/machine-config-server-vlqfv                                     1/1     Running   1               4h47m

NAME                                                               DATA   AGE
configmap/coreos-bootimages                                        4      4h50m
configmap/digest-rendered-infra-89d689d461245f3557da375e30cc5c90   1      12m     <----- LEAKED
configmap/kube-rbac-proxy                                          1      4h50m
configmap/kube-root-ca.crt                                         1      4h51m
configmap/kubeconfig-data                                          1      4h47m
configmap/machine-config-operator-images                           1      4h50m
configmap/machine-config-osimageurl                                4      4h50m
configmap/openshift-service-ca.crt                                 1      4h51m
Expected results:
After the MOSC resource is removed, all related configmaps should be garbage collected, and a new MOSC created later should build successfully.
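The expected state can be checked with a small script. This is only a sketch: the helper name check_no_digest_configmaps is hypothetical, the "digest-rendered-" prefix is inferred from the leaked configmap name shown in the output above, and the namespace is assumed to be the one used in the reproduction script.

```shell
#!/bin/bash
# Assumption: the MCO namespace used by the reproduction script.
NAMESPACE="openshift-machine-config-operator"

# Hypothetical helper: reads resource names on stdin (one per line, in the
# "configmap/<name>" form printed by `oc get configmap -o name`), prints any
# digest configmaps it finds, and returns non-zero if at least one exists.
check_no_digest_configmaps() {
    leaked_configmaps=$(grep '^configmap/digest-rendered-' || true)
    if [ -n "$leaked_configmaps" ]; then
        echo "Leaked digest configmaps:"
        echo "$leaked_configmaps"
        return 1
    fi
    echo "No digest configmaps found"
}
```

After deleting the MOSC and waiting for the build pod to leave Terminating status, it could be run as: oc get configmap -n "$NAMESPACE" -o name | check_no_digest_configmaps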
Additional info:
- is related to: OCPBUGS-36810 When MCOS is deleted in building state configmap resources related to MOSC are not deleted. (Closed)