  1. OpenShift Bugs
  2. OCPBUGS-59947

[DOC] No procedure to reprovision or delete worker nodes on baremetal hosted clusters when Machine Health Checks are disabled


    • Quality / Stability / Reliability
    • Moderate

      Description of problem:

      Currently, the documentation does not provide a procedure for reprovisioning or deleting a worker node in a baremetal hosted cluster when the Machine Health Checks feature is disabled. When Machine Health Checks are active, they automatically handle the replacement of unhealthy nodes. However, when the feature is intentionally disabled, there are no documented steps for manually replacing or removing a worker node. This leads to uncertainty and potential issues when a node needs maintenance or removal.
      

      Steps to Reproduce:

      1. Set up a baremetal hosted cluster.
      2. Disable the Machine Health Checks feature for the worker nodes.

      See "Disabling machine health checks on non-bare-metal agent machines".
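
      Step 2 can be sketched as follows. This is only an illustration, not the documented procedure: it assumes that Machine Health Checks for a node pool are controlled by the NodePool field spec.management.autoRepair, which this report does not state. The namespace "clusters" and the node pool name "example-nodepool" are hypothetical placeholders. The script only builds and prints an `oc patch` command; it does not run it.

```python
import json
import shlex


def build_autorepair_patch(enabled: bool) -> str:
    """Return a JSON merge patch that sets spec.management.autoRepair.

    Assumption: autoRepair is the NodePool field that toggles
    Machine Health Checks for the pool's worker nodes.
    """
    return json.dumps({"spec": {"management": {"autoRepair": enabled}}})


def build_patch_command(namespace: str, nodepool: str, enabled: bool) -> list[str]:
    """Build (but do not execute) an `oc patch` command for a NodePool."""
    return [
        "oc", "patch", "nodepool", nodepool,
        "-n", namespace,
        "--type", "merge",
        "-p", build_autorepair_patch(enabled),
    ]


if __name__ == "__main__":
    # Hypothetical namespace and node pool name; replace with real values.
    cmd = build_patch_command("clusters", "example-nodepool", enabled=False)
    # Print the command for review instead of running it.
    print(shlex.join(cmd))
```

      Printing the command rather than executing it keeps the sketch safe to run anywhere; the output can be reviewed and run against the management cluster by hand.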

       

      Actual results:

      The documentation is missing instructions for these manual operations when Machine Health Checks are disabled. This forces users to guess the correct procedure, which could risk the stability of the cluster.
      

      Expected results:

      There should be a clear and documented procedure for:
      * Reprovisioning a worker node to replace an existing one.
      * Deleting a worker node.

      Suggested Fix:

      Add a new section to the documentation that provides step-by-step instructions for manually reprovisioning and deleting worker nodes in a baremetal hosted cluster environment where Machine Health Checks are not active. 

      Adding Info:
      I have created the following KCS article for now:
      Worker Node fails to rejoin hosted cluster after repair on Bare Metal Cluster disabling Machine Health Checks

              rhn-support-lahinson Laura Hinson
              rhn-support-rnoma Ryoji Noma