  1. OpenShift Bugs
  2. OCPBUGS-36433

MissingMachineConfig alert not fired


    • Moderate
    • None
    • False

      Description of problem:

      The MissingMachineConfig alert is not triggered when the Machine Config Daemon (MCD) fails because it cannot find the MachineConfig (MC).
          

      Version-Release number of selected component (if applicable):

      quay.io/openshift-release-dev/ocp-release:4.16.0-x86_64
          

      How reproducible:

      Always
          

      Steps to Reproduce:

          1. Create an MC that deploys any file to the worker MCP.
          
          Get the name of the new rendered MC:
          
          rendered-worker-bf829671270609af06e077311a39363e 9e4a1f5f4c7ef58082021ca40556c67f99062d0a   3.4.0             23s
          
          2. When the first node starts updating, delete the new rendered MC
          
          oc delete mc rendered-worker-bf829671270609af06e077311a39363e
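
          The two steps above could be scripted roughly as follows. This is a minimal sketch, not the reporter's exact procedure: the MC name 99-worker-test-file, the file path /etc/test-file, the Ignition version 3.2.0, and both helper function names are assumptions chosen for illustration. The oc commands appear only as comments because they need a live cluster.

```shell
#!/bin/sh
# Write a minimal MachineConfig manifest that deploys one file to worker nodes.
# The MC name, file path, and Ignition version are illustrative assumptions.
write_test_mc() {
  cat > "$1" <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-test-file
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - path: /etc/test-file
        mode: 420
        contents:
          source: data:,hello
EOF
}

# Read an MC listing sorted by creation time on stdin and print the name
# of the last (newest) entry whose name starts with "rendered-worker-".
newest_rendered_worker() {
  awk '$1 ~ /^rendered-worker-/ { name = $1 } END { if (name != "") print name }'
}

# Usage against a live cluster (not executed here):
#   write_test_mc /tmp/test-mc.yaml
#   oc create -f /tmp/test-mc.yaml
#   rendered=$(oc get mc --sort-by=.metadata.creationTimestamp | newest_rendered_worker)
#   oc delete mc "$rendered"   # run once the first worker node starts updating
```

          Sorting by creation timestamp is one way to pick out the new rendered MC; reading it from the AGE column by eye, as in the listing above, works just as well.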
          
          
          

      Actual results:

      	The node is rebooted and an error about the missing MC is reported in the logs, but no alert is fired.
      
              These are the logs:
      	
      [2024-07-02T15:49:16Z INFO  nmstatectl::persist_nic] /etc/systemd/network does not exist, no need to clean up
      I0702 15:49:16.392971   37219 daemon.go:1563] Previous boot ostree-finalize-staged.service appears successful
      E0702 15:49:16.393002   37219 writer.go:226] Marking Degraded due to: missing MachineConfig rendered-worker-bf829671270609af06e077311a39363e
      machineconfig.machineconfiguration.openshift.io "rendered-worker-bf829671270609af06e077311a39363e" not found
      I0702 15:49:45.368475   37219 certificate_writer.go:288] Certificate was synced from controllerconfig resourceVersion 193460
      I0702 15:49:48.402624   37219 daemon.go:1899] Running: /run/machine-config-daemon-bin/nmstatectl persist-nic-names --root / --kargs-out /tmp/nmstate-kargs3944168395 --cleanup
      [2024-07-02T15:49:48Z INFO  nmstatectl] Nmstate version: 2.2.29
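
              To confirm which MC the daemon reports as missing, the relevant line can be pulled out of the MCD log. The sketch below assumes the message format shown in the excerpt above; the function name is hypothetical and the pod name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Read MCD log lines on stdin and print the MachineConfig name from every
# "Marking Degraded due to: missing MachineConfig <name>" line.
missing_mc_from_log() {
  sed -n 's/.*Marking Degraded due to: missing MachineConfig \([^ ]*\).*/\1/p'
}

# Usage against a live cluster (not executed here; replace the pod name):
#   oc logs -n openshift-machine-config-operator <machine-config-daemon-pod> \
#     -c machine-config-daemon | missing_mc_from_log
```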
      
      
      
              We can see that no alert is fired:
      
      	$ alias prometheusquery='function __lgb() { unset -f __lgb; oc rsh -n openshift-monitoring prometheus-k8s-0 curl -s -k  -H "Authorization: Bearer $(oc -n openshift-monitoring create token prometheus-k8s)" --data-urlencode "query=$1" https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query; }; __lgb'
      	$ alias alerts='curl -s -k -H "Authorization: Bearer $(oc -n openshift-monitoring create token prometheus-k8s)" https://$(oc get route -n openshift-monitoring alertmanager-main -o jsonpath={.spec.host})/api/v1/alerts | jq '
      
      
      	$ prometheusquery mcd_missing_mc
      {"status":"success","data":{"resultType":"vector","result":[]}}
      	$ alerts |grep MissingMachineConfig
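
      	The empty result above can also be checked mechanically, for example in a test script. This is a minimal sketch that assumes the compact JSON shape shown in the query response above; the function name query_has_samples is hypothetical, and plain string matching is used instead of a JSON parser to keep it dependency-free.

```shell
#!/bin/sh
# Inspect a Prometheus instant-query response passed as the first argument.
# Returns 0 if the result vector holds at least one sample,
# 1 if the result vector is empty, and 2 if the query did not succeed.
query_has_samples() {
  case "$1" in
    *'"status":"success"'*) ;;
    *) return 2 ;;
  esac
  case "$1" in
    *'"result":[]'*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage with the prometheusquery alias defined above (not executed here):
#   if query_has_samples "$(prometheusquery mcd_missing_mc)"; then
#     echo "mcd_missing_mc has samples"
#   else
#     echo "mcd_missing_mc has no samples"
#   fi
```

      	For the response shown in this report the function returns 1, matching the observation that the mcd_missing_mc metric has no samples.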
          

      Expected results:

      	A MissingMachineConfig alert should be fired.
          

      Additional info:

      
          

              Unassigned
              sregidor@redhat.com Sergio Regidor de la Rosa
              Sergio Regidor de la Rosa