OCPBUGS-60533

Cold boot test with one node shut down ungracefully and the other gracefully results in unresponsive cluster


    • Bug
    • Resolution: Duplicate
    • Major
    • 4.20
    • Two Node Fencing
    • Quality / Stability / Reliability
    • False
    • Rejected

      Description of problem:

      In a cold boot test, one node is brought down ungracefully and the other node is shut down gracefully. Once the virsh list command confirms the VMs are no longer running, I start the VMs from the shut-off state. After the nodes boot, pcs reports the etcd resource as stopped on both nodes:
      
      [core@master-1 ~]$ sudo pcs resource status
        * Clone Set: kubelet-clone [kubelet]:
          * Started: [ master-0 master-1 ]
        * Clone Set: etcd-clone [etcd]:
          * Stopped: [ master-0 master-1 ]
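
A status check like the one above can be scripted. The following is a minimal sketch, not part of the original report: the helper name stopped_clone_sets is hypothetical, and it assumes the pcs resource status output keeps the "Clone Set:" / "Stopped:" layout shown above.

```shell
# Hypothetical helper (not from the report): reads `pcs resource status`
# output on stdin and prints the name of every clone set that has a
# "Stopped:" line beneath it.
stopped_clone_sets() {
  awk '
    # e.g. "* Clone Set: etcd-clone [etcd]:" -> the 4th field is the clone set name
    /Clone Set:/ { name = $4 }
    # a "Stopped:" line belongs to the most recently seen clone set
    /Stopped:/ && name != "" { print name }
  '
}
```

On a node it could be used as: sudo pcs resource status | stopped_clone_sets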
      

      Version-Release number of selected component (if applicable):

          

      How reproducible:

      95%  

      Steps to Reproduce:

       1. Deploy a TNF cluster using TNT
       2. Power off each node using virsh destroy <node name>
       3. Wait until the nodes are no longer in the running state
             (virsh list --all)
       4. Start each node
       5. Check the resource state with pcs (sudo pcs resource status)
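
Steps 2 to 4 above can be sketched as a small script run on the hypervisor host. This is an illustration under assumptions, not the reporter's actual procedure: the function names are hypothetical, the libvirt domain names are passed as arguments because the report does not list them, and the wait polls virsh domstate per domain instead of reading virsh list --all.

```shell
# Hypothetical helpers (not from the report) sketching steps 2-4.

# Poll each domain until `virsh domstate` reports "shut off".
# $1 is the total number of one-second retries allowed; the rest are domain names.
wait_for_shut_off() {
  tries=$1
  shift
  for wait_dom in "$@"; do
    while [ "$(virsh domstate "$wait_dom")" != "shut off" ]; do
      [ "$tries" -gt 0 ] || return 1   # retry budget used up
      tries=$((tries - 1))
      sleep 1
    done
  done
}

# Power off every domain ungracefully, wait for shut-off, then start them again.
cold_boot() {
  for dom in "$@"; do virsh destroy "$dom"; done   # step 2
  wait_for_shut_off 120 "$@" || return 1           # step 3
  for dom in "$@"; do virsh start "$dom"; done     # step 4
}
```

It would be invoked as cold_boot followed by the two domain names. Step 5 (sudo pcs resource status) still has to be run on one of the nodes once it is reachable. For the mixed scenario in the description, virsh shutdown could replace virsh destroy for the node that is shut down gracefully.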
          

      Actual results:

          The etcd-podman does not seem to start, so etcd does not start.

      Expected results:

          The cluster should be recreated, and should be in working order.

      Additional info:

          

              rh-ee-clobrano Carlo Lobrano
              rh-ee-dhensel Douglas Hensel
              Douglas Hensel Douglas Hensel