Red Hat Workload Availability / RHWA-216

SNR mitigation handling healthy nodes with Out Of Service Taint

      Cause: Stale workloads sometimes remain on a healthy node because the out-of-service taint fails to remove them.
      Consequence: The out-of-service taint stays on the node and limits use of the healthy node.
      Fix: Add an expiry mechanism to the out-of-service taint.
      Result: The out-of-service taint is cleaned up even if workloads are stuck terminating.
    • Feature
    • Proposed

      We had a couple of occurrences [see examples below *] where customers ran into a situation in which SNR remediated a node but the OOS (out-of-service) taint remained on the node.
      This ticket is about investigating this and suggesting a mitigation if possible.

      [*] Related issues:
      CNV-57804
      RHWA-6

              mshitrit@redhat.com Michael Shitrit
              Votes: 0
              Watchers: 6
