Bug
Resolution: Unresolved
Normal
None
4.16
Moderate
None
False
Description of problem:
The MissingMachineConfig alert is not triggered when the MCD fails because it cannot find the MC.
Version-Release number of selected component (if applicable):
quay.io/openshift-release-dev/ocp-release:4.16.0-x86_64
How reproducible:
Always
Steps to Reproduce:
1. Create a MC to deploy any file in the worker MCP. Get the name of the new rendered MC:
   rendered-worker-bf829671270609af06e077311a39363e   9e4a1f5f4c7ef58082021ca40556c67f99062d0a   3.4.0   23s
2. When the first node starts updating, delete the new rendered MC:
   $ oc delete mc rendered-worker-bf829671270609af06e077311a39363e
Actual results:
The node is rebooted and an error regarding the missing MC is reported in the logs, but no alert is fired.
These are the logs:
[2024-07-02T15:49:16Z INFO nmstatectl::persist_nic] /etc/systemd/network does not exist, no need to clean up
I0702 15:49:16.392971 37219 daemon.go:1563] Previous boot ostree-finalize-staged.service appears successful
E0702 15:49:16.393002 37219 writer.go:226] Marking Degraded due to: missing MachineConfig rendered-worker-bf829671270609af06e077311a39363e
machineconfig.machineconfiguration.openshift.io "rendered-worker-bf829671270609af06e077311a39363e" not found
I0702 15:49:45.368475 37219 certificate_writer.go:288] Certificate was synced from controllerconfig resourceVersion 193460
I0702 15:49:48.402624 37219 daemon.go:1899] Running: /run/machine-config-daemon-bin/nmstatectl persist-nic-names --root / --kargs-out /tmp/nmstate-kargs3944168395 --cleanup
[2024-07-02T15:49:48Z INFO nmstatectl] Nmstate version: 2.2.29
We can see that no alert is fired:
$ alias prometheusquery='function __lgb() { unset -f __lgb; oc rsh -n openshift-monitoring prometheus-k8s-0 curl -s -k -H "Authorization: Bearer $(oc -n openshift-monitoring create token prometheus-k8s)" --data-urlencode "query=$1" https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query; }; __lgb'
$ alias alerts='curl -s -k -H "Authorization: Bearer $(oc -n openshift-monitoring create token prometheus-k8s)" https://$(oc get route -n openshift-monitoring alertmanager-main -o jsonpath={.spec.host})/api/v1/alerts | jq '
$ prometheusquery mcd_missing_mc
{"status":"success","data":{"resultType":"vector","result":[]}}
$ alerts |grep MissingMachineConfig
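To make the "no metric reported" check scriptable when re-running the reproducer, something like the following could be used. This is only a sketch: has_samples is a hypothetical helper name, not part of oc or the MCO, and it assumes the query response is the compact single-line JSON shown above (a JSON-aware tool such as jq would be more robust).

```shell
# Hypothetical helper (not part of oc or the MCO): reads a Prometheus
# instant-query response on stdin.
#   exit 0 -> the result vector contains at least one sample
#   exit 1 -> the result vector is empty
#   exit 2 -> the response does not look like a successful query
# Assumes compact JSON, as in the response shown in this report.
has_samples() {
    response=$(cat)
    case "$response" in
        *'"result":[]'*) return 1 ;;
        *'"status":"success"'*) return 0 ;;
        *) return 2 ;;
    esac
}

# Possible interactive usage with the prometheusquery alias defined above:
#   prometheusquery mcd_missing_mc | has_samples || echo "mcd_missing_mc not reported"
```

With the response captured in this report, the helper would return 1, which matches the observation that mcd_missing_mc has no samples.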
Expected results:
A MissingMachineConfig alert should be fired.
Additional info:
is related to:
- OCPBUGS-38733 rendered MachineConfig in use not recreated in OpenShift 4.16 (Verified)
- OCPBUGS-17788 OpenShift Container Platform 4.13.4 installation is failing because of rendered-master-${hash} not found (Closed)