  1. Machine Config Operator
  2. MCO-88

Add runbook for MCCDrainError alert

Details

    • Type: Story
    • Resolution: Unresolved
    • Priority: Major
    • Sprint: Sprint 229

    Description

      The Machine Config Controller (MCC) raises a drain alert when a node drain does not succeed within the drain timeout period (currently 1 hour). The alert prompts the admin to check the MCC pod logs and take action if required, and it states where to find those logs.

      An example alert looks like:

      Drain failed on Node <node_name>, updates may be blocked. For more details: oc logs -f -n openshift-machine-config-operator machine-config-controller-xxxxx -c machine-config-controller
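
      Tooling that forwards this alert may want the node name and the logs command as separate fields. The following is a minimal Python sketch of that, assuming the message keeps the shape of the example above. The pattern, the function name parse_drain_alert, and the node name worker-0 are hypothetical and are not part of the MCO.

```python
import re
import shlex

# Pattern modeled on the example alert text in this issue. This is an
# assumption: the real alert wording may differ between releases.
ALERT_PATTERN = re.compile(
    r"Drain failed on Node (?P<node>\S+), updates may be blocked\. "
    r"For more details: (?P<command>oc logs .+)$"
)


def parse_drain_alert(message):
    """Return (node_name, logs_command_argv) parsed from a drain alert message.

    Raises ValueError if the message does not match the expected shape.
    """
    match = ALERT_PATTERN.search(message.strip())
    if match is None:
        raise ValueError("not a recognized drain alert message")
    # shlex.split turns the command string into an argv list that could be
    # passed to subprocess.run without invoking a shell.
    return match.group("node"), shlex.split(match.group("command"))


if __name__ == "__main__":
    # "worker-0" is a made-up node name used only for illustration.
    example = (
        "Drain failed on Node worker-0, updates may be blocked. "
        "For more details: oc logs -f -n openshift-machine-config-operator "
        "machine-config-controller-xxxxx -c machine-config-controller"
    )
    node, argv = parse_drain_alert(example)
    print(node)
    print(argv)
```

      Returning the command as an argument list rather than a string lets a caller run it without a shell, which avoids quoting problems if the pod name contains unexpected characters.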

      After reading the MCC pod logs, the admin may still be unable to tell exactly what action to take. Adding a runbook (https://github.com/openshift/runbooks) would help the admin troubleshoot and take appropriate action.

      Acceptance Criteria:

      • A runbook doc is created for the MCCDrainError alert
      • The link to the created runbook is available to the cluster admin along with the MCCDrainError alert

            People

              Assignee: Unassigned
              Reporter: Sinny Kumari (rhn-engineering-skumari)