OpenShift Bugs / OCPBUGS-17963

Machine-config operator degraded with MachineConfigControllerFailed

    • Quality / Stability / Reliability
    • False
    • Moderate
    • No
    • MCO Sprint 241, MCO Sprint 242, MCO Sprint 243, MCO Sprint 244
    • 4

      Description of problem:

      The machine-config operator was found in a Degraded state with reason MachineConfigControllerFailed.
      
      ~~~
        - lastTransitionTime: '2023-08-14T01:54:52Z'
          message: 'Failed to resync 4.12.26 because: error during waitForControllerConfigToBeCompleted:
            [timed out waiting for the condition, controllerconfig is not completed: ControllerConfig
            has not completed: completed(false) running(false) failing(true)]'
          reason: MachineConfigControllerFailed
          status: 'True'
          type: Degraded
      ~~~
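
When triaging a status block like the one above, it can help to filter the operator's condition list down to the entries that are actually degraded. The following is a minimal sketch, assuming the conditions have already been loaded into plain dictionaries (for example from the operator's YAML or JSON output). The function name `degraded_conditions` and the second, healthy-looking `Available` entry are hypothetical and only there for illustration.

```python
from typing import Dict, List


def degraded_conditions(conditions: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the conditions whose type is Degraded and whose status is 'True'."""
    return [
        c for c in conditions
        if c.get("type") == "Degraded" and c.get("status") == "True"
    ]


# First entry is copied from the ticket; the second is a hypothetical
# placeholder so the filter has something to exclude.
conditions = [
    {
        "lastTransitionTime": "2023-08-14T01:54:52Z",
        "message": (
            "Failed to resync 4.12.26 because: error during "
            "waitForControllerConfigToBeCompleted: [timed out waiting for the "
            "condition, controllerconfig is not completed: ControllerConfig has "
            "not completed: completed(false) running(false) failing(true)]"
        ),
        "reason": "MachineConfigControllerFailed",
        "status": "True",
        "type": "Degraded",
    },
    {"type": "Available", "status": "True"},  # hypothetical entry
]

for cond in degraded_conditions(conditions):
    print(cond["lastTransitionTime"], cond["reason"])
```

The status is compared against the string `"True"` because the condition status in the ticket is a quoted string, not a boolean.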
      
      The machine-config-controller logs show the controller failing to read the /etc/mcc/templates directory:
      
      ~~~
      2023-08-14T04:29:47.125008119Z I0814 04:29:47.124988       1 render_controller.go:377] Error syncing machineconfigpool worker: ControllerConfig has not completed: completed(false) running(false) failing(true)
      2023-08-14T04:30:03.142875064Z E0814 04:30:03.142826       1 template_controller.go:426] failed to read dir "/etc/mcc/templates": readdirent /etc/mcc/templates: no such file or directory
      2023-08-14T04:30:03.142875064Z I0814 04:30:03.142851       1 template_controller.go:427] Dropping controllerconfig "machine-config-controller" out of the queue: failed to read dir "/etc/mcc/templates": readdirent /etc/mcc/templates: no such file or directory
      2023-08-14T04:30:27.948653516Z I0814 04:30:27.948609       1 container_runtime_config_controller.go:364] Error syncing image config openshift-config: could not Create/Update MachineConfig: could not generate original ContainerRuntime Configs: generateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/master": readdirent /etc/mcc/templates/master: no such file or directory
      ~~~
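
To summarize which directories the controller reports as missing, the log lines above can be scanned with a regular expression. This is a rough sketch that assumes the message wording stays exactly as it appears in this ticket (`failed to read dir "...": readdirent ...: no such file or directory`). The helper name `missing_dirs` is hypothetical, and the sample lines are copied from the logs above.

```python
import re
from typing import Iterable, List

# Matches the "failed to read dir" message as it appears in this ticket's logs.
MISSING_DIR = re.compile(
    r'failed to read dir "([^"]+)": readdirent [^:]+: no such file or directory'
)


def missing_dirs(lines: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated directories reported as missing."""
    found = set()
    for line in lines:
        found.update(MISSING_DIR.findall(line))
    return sorted(found)


log_lines = [
    '2023-08-14T04:29:47.125008119Z I0814 04:29:47.124988       1 render_controller.go:377] Error syncing machineconfigpool worker: ControllerConfig has not completed: completed(false) running(false) failing(true)',
    '2023-08-14T04:30:03.142875064Z E0814 04:30:03.142826       1 template_controller.go:426] failed to read dir "/etc/mcc/templates": readdirent /etc/mcc/templates: no such file or directory',
    '2023-08-14T04:30:27.948653516Z I0814 04:30:27.948609       1 container_runtime_config_controller.go:364] Error syncing image config openshift-config: could not Create/Update MachineConfig: could not generate original ContainerRuntime Configs: generateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/master": readdirent /etc/mcc/templates/master: no such file or directory',
]

for path in missing_dirs(log_lines):
    print(path)
```

Lines that do not contain the message, such as the render_controller entry, are simply skipped.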
      
      The customer recently upgraded the cluster from 4.10.61 to 4.12.26. On the running cluster, the customer observed that the machine-config, network, and sample operators were stuck in a degraded state.
      
      The issue was resolved by deleting the machine-config-controller controllerconfig and the network and sample operator pods.
      
      We are looking for an RCA of what caused the issue. No pruning activity was performed on the cluster.

      Version-Release number of selected component (if applicable):

      4.12.26


              jerzhang@redhat.com Yu Qi Zhang
              rhn-support-aksjadha Akshata Jadhav
              Yu Qi Zhang
              Sergio Regidor de la Rosa