OpenShift Bugs / OCPBUGS-4628

OpenShift 4.10 cluster went into a bad state while performing the node replacement procedure


    • Important
    • None
    • Rejected
    • False

      Description of problem:

      oc commands fail during the node replacement procedure on a 3-node cluster with schedulable masters, deployed via UPI. The failures begin after the powered-down node has been drained and deleted from the cluster.
      
      

      Version-Release number of selected component (if applicable):

      4.10.9
      
      

      How reproducible:

      100%
      
      

      Steps to Reproduce:

      1. Install an RHOCP cluster (3-node cluster, masters are schedulable).
      2. Install ODF and deploy an application that uses a PV provisioned by ODF.
      3. Power off one of the master nodes, then follow the documentation below:
        https://docs.openshift.com/container-platform/4.10/backup_and_restore/control_plane_backup_and_restore/replacing-unhealthy-etcd-member.html#restore-replace-stopped-baremetal-etcd-member_replacing-unhealthy-etcd-member
      4. After running "oc delete node <name>", oc commands hang and never recover. kube-apiserver keeps crashing on the running nodes (the 2 nodes that were untouched).
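
The drain and delete sequence described in this report can be sketched roughly as follows. This is a minimal illustration, not the exact commands the reporter ran: the drain flags and the 30-second request timeout are assumptions, and the `OC` override variable and the function name `repro_node_delete` are hypothetical helpers added so the sequence can be dry-run without a cluster.

```shell
#!/usr/bin/env bash
# Sketch of steps 3-4: drain the powered-off master, delete it, then probe the API.
# OC is a hypothetical override; set OC=echo to print the commands instead of running them.
OC="${OC:-oc}"

repro_node_delete() {
  local node="$1"   # name of the powered-off master node

  # Evict workloads from the node. These flags are an assumption, not taken from the report.
  $OC adm drain "$node" --ignore-daemonsets --delete-emptydir-data --force

  # The step after which the report says oc commands start to hang.
  $OC delete node "$node"

  # Bounded probe: with a request timeout, an unresponsive API server
  # produces an error instead of blocking the shell indefinitely.
  $OC get nodes --request-timeout=30s
}
```

For example, `OC=echo repro_node_delete master-2` prints the three commands for a hypothetical node named `master-2` without contacting a cluster. The request timeout on the final probe is there so that the hang described in step 4 shows up as a failed command rather than a stuck terminal.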
      
      

      Actual results:

      oc commands hang, and the kube-apiserver container keeps restarting on the two nodes that were untouched.
      
      

      Expected results:

      oc commands should keep working after the node is deleted.
      
      

      Additional info:

      
      

              Unassigned Unassigned
              rhn-support-dgautam Dhruv Gautam
              Rahul Gangwar Rahul Gangwar