Red Hat OpenShift Data Science / RHODS-4364

Downtime during node restart


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • High

      Description of problem:

      The node restart took 7 minutes, and the service was unavailable for all 7 of them. This likely happens because all the pods of some services run on the same node, so those services are lost while that node restarts. For example, in my test the black-box pods were down, and a notebook I tried to spawn started only after the node came back up.
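
One way to check the suspected cause is to look for services whose pods all sit on a single node. The following is a minimal sketch, not part of the original report: it assumes the pod list comes from something like `oc get pods -o json`, and that an `app` label identifies each service. The function name, the label key, and the sample pod and node names are all hypothetical.

```python
from collections import defaultdict


def single_node_apps(pod_list, label="app"):
    """Return {app: node} for apps whose pods all run on one node.

    pod_list is the parsed JSON of a Kubernetes PodList.
    Grouping by the `app` label is an assumption; use whichever label
    identifies a service in your cluster.
    """
    nodes_by_app = defaultdict(set)
    for pod in pod_list.get("items", []):
        labels = pod.get("metadata", {}).get("labels") or {}
        node = pod.get("spec", {}).get("nodeName")
        app = labels.get(label)
        if app is None or node is None:
            continue  # skip unlabeled or not-yet-scheduled pods
        nodes_by_app[app].add(node)
    return {
        app: next(iter(nodes))
        for app, nodes in nodes_by_app.items()
        if len(nodes) == 1
    }


# Hypothetical sample data, shaped like `oc get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "blackbox-exporter-1", "labels": {"app": "blackbox-exporter"}},
         "spec": {"nodeName": "worker-a"}},
        {"metadata": {"name": "blackbox-exporter-2", "labels": {"app": "blackbox-exporter"}},
         "spec": {"nodeName": "worker-a"}},
        {"metadata": {"name": "dashboard-1", "labels": {"app": "dashboard"}},
         "spec": {"nodeName": "worker-a"}},
        {"metadata": {"name": "dashboard-2", "labels": {"app": "dashboard"}},
         "spec": {"nodeName": "worker-b"}},
    ]
}

at_risk = single_node_apps(sample)
print(at_risk)
```

Any app reported here would go down with its node. Note that an app with a single replica is always reported, since one pod can only be on one node.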

      Prerequisites (if any, like setup, operators/versions):

      RHODS installed in a cluster with 2 worker nodes

      Steps to Reproduce

      1. Install RHODS
      2. Restart a node
      3. Verify availability with rhods_aggregate_availability
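
To turn step 3 into a number, the downtime can be summed from the availability samples. This is a rough sketch under stated assumptions: `rhods_aggregate_availability` is taken to be 1 when the service is available and lower otherwise, and the samples are assumed to be (timestamp, value) pairs such as those returned by a Prometheus range query. The function name and the sample values are illustrative and are not data from this issue.

```python
def downtime_seconds(samples, threshold=1.0):
    """Sum the time during which availability was below `threshold`.

    samples: list of (unix_timestamp, value) pairs sorted by time.
    Values may be strings, as in Prometheus range-query results.
    Each interval is attributed to the sample at its start.
    """
    total = 0.0
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        if float(v0) < threshold:
            total += t1 - t0
    return total


# Illustrative samples at a 60-second step: available, then down, then back.
values = ["1", "1", "0", "0", "0", "0", "0", "0", "0", "1", "1"]
samples = [(i * 60, v) for i, v in enumerate(values)]

downtime = downtime_seconds(samples)
print(downtime / 60, "minutes of downtime")
```

The resolution of the result is limited by the sampling step, so a short outage that falls between two samples would not be counted.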

      Actual results:

      There is a downtime of 7 minutes

      Expected results:

      There is no downtime

      Reproducibility (Always/Intermittent/Only Once):

      Intermittent; it depends on which node is taken down.

      Build Details:

      Workaround:

      Additional info:

        1. Screenshot from 2022-06-25 00-15-42.png (239 kB, Pablo Felix)
        2. Screenshot from 2022-06-25 00-16-42.png (218 kB, Pablo Felix)
        3. Screenshot from 2022-06-25 00-17-06.png (196 kB, Pablo Felix)
        4. Screenshot from 2022-06-25 00-21-37.png (231 kB, Pablo Felix)
        5. Screenshot from 2022-06-25 00-22-44.png (294 kB, Pablo Felix)
        6. Screenshot from 2022-06-25 00-29-17.png (280 kB, Pablo Felix)

              Unassigned
              Pablo Felix (Inactive)
              Votes: 0
              Watchers: 5