  1. OpenShift Bugs
  2. OCPBUGS-33115

Updates in Backup restore and disaster recovery for hosted control planes

    • Important
    • No
    • 5
    • OSDOCS Sprint 259, OSDOCS Sprint 260
    • 2
    • False

      Description of problem:
      Some steps are inaccurate in this doc section: https://docs.openshift.com/container-platform/4.15/hosted_control_planes/hcp-backup-restore-dr.html#hosted-etcd-non-disruptive-recovery

      In section: Checking the status of a hosted cluster

      1. Enter the running etcd pod that you want to check by entering the following command:
      $ oc rsh -n <control_plane_namespace> -c etcd <etcd_pod_name>

      This should be updated to:

      $ oc rsh -n openshift-etcd -c etcd <etcd_pod_name>

      2. Set up the etcdctl environment by entering the following commands:

      This step could be removed; step 3 can be run directly, as shown below:

      3. Print the endpoint status for each cluster member by entering the following command:
      sh-4.4# etcdctl endpoint status -w table
      +------------------------------+--------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
      | ENDPOINT                     | ID           | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
      +------------------------------+--------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
      | https://192.168.1xxx.20:2379 | 8fxxxxxxxxxx | 3.5.12  | 123 MB  | false     | false      | 10        | 180156     | 180156             |        |
      | https://192.168.1xxx.21:2379 | a5xxxxxxxxxx | 3.5.12  | 122 MB  | false     | false      | 10        | 180156     | 180156             |        |
      | https://192.168.1xxx.22:2379 | 7cxxxxxxxxx  | 3.5.12  | 124 MB  | true      | false      | 10        | 180156     | 180156             |        |
      +------------------------------+--------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
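For reference, steps 1 and 3 above could probably be combined into a single non-interactive command. The sketch below only builds and prints that command so it can be reviewed before it is run. It assumes that `oc exec` reaches the same `etcd` container as `oc rsh`, and that the `etcdctl` environment is already configured inside that container, as this report suggests. The helper name `etcd_status_cmd` and the default namespace are illustrative, not part of the documented procedure.

```shell
# Hypothetical helper: print (do not run) the one-shot status command
# for a given etcd pod. Namespace defaults to openshift-etcd (assumption
# taken from this report).
etcd_status_cmd() {
  pod="$1"
  ns="${2:-openshift-etcd}"
  printf 'oc exec -n %s -c etcd %s -- etcdctl endpoint status -w table\n' "$ns" "$pod"
}

# Example: show the command for a pod named etcd-2
etcd_status_cmd etcd-2
```

Printing the command first makes it easy to verify the namespace and pod name; the printed line can then be pasted into a shell that is logged in to the cluster.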

      Regarding the section:

      Recovering an etcd member for a hosted cluster

      1. If you need to confirm that the etcd member is failing, enter the following command:

      $ oc get pods -l app=etcd -n <control_plane_namespace>

      This could be updated to:

      $ oc get pods -l app=etcd -n openshift-etcd
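To make the check in step 1 less manual, the pod listing could be filtered for members that look unhealthy. This is a minimal sketch, assuming the default `oc get pods --no-headers` column order (NAME, READY, STATUS, ...). The function name `failing_etcd_pods` is hypothetical.

```shell
# Hypothetical filter: read `oc get pods --no-headers` output on stdin and
# print the name of every pod that is not Running or not fully ready.
failing_etcd_pods() {
  awk '{
    split($2, ready, "/")                      # READY column, e.g. 3/4
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Intended usage against a live cluster (not executed here):
#   oc get pods -l app=etcd -n openshift-etcd --no-headers | failing_etcd_pods
```

An empty result would suggest that all etcd pods are running and ready; any name printed is a candidate for the recovery step that follows.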

      2. Delete the persistent volume claim of the failing etcd member and the pod by entering the following command:

      $ oc delete pvc/<pvc_name> pod/<etcd_pod_name> --wait=false

      No PVC is available for any of the etcd pods, so this step could be changed to the following instead:

      $ oc delete pods etcd-2 -n openshift-etcd

      I will continue to update this issue with further improvement items for this doc.


              kowen@redhat.com Kevin Owen
              rhn-support-geliu Ge Liu
              Ge Liu Ge Liu