Machine Config Operator / MCO-1537

Add runbook for MCDRebootError alert


    • Type: Story
    • Resolution: Unresolved

      The MCDRebootError alert fires when a node fails to reboot within a span of 5 minutes. This ensures that the admin can take appropriate action, if required, by looking at the pod logs. The alert contains information on where to find the logs.

      An example alert message looks like:

       Reboot failed on {{ $labels.node }} , update may be blocked. For more details:  oc logs -f -n {{ $labels.namespace }} {{ $labels.pod }} -c machine-config-daemon
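
      To make the alert message concrete, here is a minimal sketch of how the placeholders in the template above might be filled in and how the resulting `oc logs` command can be pulled out of the message. The label values (`worker-0`, the namespace, and the pod name) are hypothetical examples, and the helper names `render_alert` and `logs_command` are illustrative only; they are not part of the Machine Config Operator or of Prometheus.

```python
import re

# Alert message template, copied from the example alert above.
ALERT_TEMPLATE = (
    "Reboot failed on {{ $labels.node }} , update may be blocked. "
    "For more details:  oc logs -f -n {{ $labels.namespace }} "
    "{{ $labels.pod }} -c machine-config-daemon"
)

# Matches placeholders of the form {{ $labels.NAME }}.
PLACEHOLDER = re.compile(r"\{\{\s*\$labels\.(\w+)\s*\}\}")


def render_alert(template: str, labels: dict[str, str]) -> str:
    """Replace each {{ $labels.NAME }} placeholder with labels[NAME]."""
    return PLACEHOLDER.sub(lambda m: labels[m.group(1)], template)


def logs_command(rendered: str) -> str:
    """Return the 'oc logs ...' portion of a rendered alert message."""
    return rendered[rendered.index("oc logs"):]


# Hypothetical label values, for illustration only.
example_labels = {
    "node": "worker-0",
    "namespace": "openshift-machine-config-operator",
    "pod": "machine-config-daemon-abcde",
}

message = render_alert(ALERT_TEMPLATE, example_labels)
print(message)
print(logs_command(message))
```

      The extracted command is what an admin would run to follow the machine-config-daemon container logs for the affected node.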

      The admin may not be able to determine the exact action to take after looking at the machine-config-daemon pod logs. Adding a runbook (https://github.com/openshift/runbooks) can help the admin troubleshoot the problem and take appropriate action.


      Acceptance Criteria:

      • A runbook document is created for the MCDRebootError alert
      • The runbook link is accessible to the cluster admin from the MCDRebootError alert


              rhn-support-cruhm Courtney Ruhm