  1. Data Foundation Bugs
  2. DFBUGS-365

[2276204] [RDR] [Hub recovery] [Co-situated] CURRENTSTATE and PROGRESSION takes longer to retain their status post hub recovery

    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Critical
    • odf-4.18
    • odf-4.15
    • odf-dr/ramen
    • None
    • False
    • False
    • Committed
    • Committed
    • Release Note Not Required
    • RamenDR sprint 2024 #16, RamenDR sprint 2024 #17, RamenDR sprint 2024 #18, RamenDR sprint 2024 #19, RamenDR sprint 2024 #21
    • None

      Description of problem (please be as detailed as possible and provide log
      snippets):

      Version of all relevant components (if applicable):

      ACM 2.10.1 GA'ed
      MCE 2.5.2
      ODF 4.15.1-1
      ceph version 17.2.6-196.el9cp (cbbf2cfb549196ca18c0c9caff9124d83ed681a4) quincy (stable)
      OCP 4.15.0-0.nightly-2024-04-07-120427
      Submariner 0.17.0 GA'ed
      VolSync 0.9.1

      Platform: VMware

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?

      Is there any workaround available to the best of your knowledge?

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?

      Is this issue reproducible?

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      ****Active hub co-situated with primary managed cluster****

      1. When multiple workloads (RBD and CephFS) of both subscription and appset (pull model) types, in different states (Deployed, FailedOver, Relocated), are running on the primary managed cluster C1, and C1 goes down along with the active hub during a site failure at site-1, perform hub recovery and move to the passive hub at site-2 (which is co-situated with the secondary managed cluster C2).
      2. Ensure the available managed cluster C2 is successfully imported on the RHACM console of the passive hub, and that the DRPolicy gets validated.
      3. After the DRPCs are restored, fail over all the workloads to the available managed cluster C2.
      4. When failover is successful, recover the down managed cluster C1 and ensure it is successfully cleaned up.
      5. Let IOs continue for some time, and configure another hub cluster at site-1 so that hub recovery can be performed one more time.
      6. Relocate all the workloads to the managed cluster C1 (which was recovered post disaster).
      7. Perform hub recovery again by bringing down the current active hub at site-2 and the C1 cluster at site-1.
      8. After moving to the new hub at site-1, ensure the available managed cluster C2 is successfully imported on the RHACM console of the passive hub, and that the DRPolicy gets validated.
      9. When the DRPCs are restored, check the CURRENTSTATE and PROGRESSION of the workloads that were running on the down cluster C1, and monitor the time it takes to rebuild the state.

      Actual results: CURRENTSTATE and PROGRESSION take a long time to regain their status after hub recovery.

      For step 8, the DRPolicy was validated on the new hub at site-1 at around:
      amanagrawal@Amans-MacBook-Pro ~ % date -u
      Sat Apr 20 12:48:46 UTC 2024

      This is the drpc status after that:

      amanagrawal@Amans-MacBook-Pro ~ % while true; date -u; do drpc; echo "*****************************************"; sleep 5; done
      Sat Apr 20 12:50:18 UTC 2024
      NAMESPACE              NAME                                     AGE     PREFERREDCLUSTER    FAILOVERCLUSTER     DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
      busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc       4m44s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc       4m44s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc       4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-13   cephfs-sub-busybox13-placement-1-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc    4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc    4m44s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-23   cephfs-sub-busybox23-placement-1-drpc    4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-24   rbd-sub-busybox24-placement-1-drpc       4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-27   cephfs-sub-busybox27-placement-1-drpc    4m43s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      busybox-workloads-28   rbd-sub-busybox28-placement-1-drpc       4m43s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc        4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox21-placement-drpc   4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox25-placement-drpc   4m43s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       cephfs-appset-busybox5-placement-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox6-placement-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox8-placement-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox1-placement-drpc       4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox2-placement-drpc       4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox22-placement-drpc      4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox26-placement-drpc      4m42s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       rbd-appset-busybox3-placement-drpc       4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox4-placement-drpc       4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      *****************************************
      Sat Apr 20 12:50:25 UTC 2024
      NAMESPACE              NAME                                     AGE     PREFERREDCLUSTER    FAILOVERCLUSTER     DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
      busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc       4m51s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc       4m51s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc       4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-13   cephfs-sub-busybox13-placement-1-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc    4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc    4m51s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-23   cephfs-sub-busybox23-placement-1-drpc    4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-24   rbd-sub-busybox24-placement-1-drpc       4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-27   cephfs-sub-busybox27-placement-1-drpc    4m50s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      busybox-workloads-28   rbd-sub-busybox28-placement-1-drpc       4m50s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc        4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox21-placement-drpc   4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox25-placement-drpc   4m50s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       cephfs-appset-busybox5-placement-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox6-placement-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox8-placement-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox1-placement-drpc       4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox2-placement-drpc       4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox22-placement-drpc      4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox26-placement-drpc      4m49s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       rbd-appset-busybox3-placement-drpc       4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox4-placement-drpc       4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      *****************************************

      ..
      ..
      ..

      *****************************************
      Sat Apr 20 12:55:02 UTC 2024
      NAMESPACE              NAME                                     AGE     PREFERREDCLUSTER    FAILOVERCLUSTER     DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
      busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc       9m28s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate       WaitForUser    Paused                                True
      busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc       9m28s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate       WaitForUser    Paused                                True
      busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc       9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate       WaitForUser    Paused                                True
      busybox-workloads-13   cephfs-sub-busybox13-placement-1-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc    9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc    9m28s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-23   cephfs-sub-busybox23-placement-1-drpc    9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-24   rbd-sub-busybox24-placement-1-drpc       9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      busybox-workloads-27   cephfs-sub-busybox27-placement-1-drpc    9m27s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      busybox-workloads-28   rbd-sub-busybox28-placement-1-drpc       9m27s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc        9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       cephfs-appset-busybox21-placement-drpc   9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox25-placement-drpc   9m27s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       cephfs-appset-busybox5-placement-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox6-placement-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       cephfs-appset-busybox8-placement-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox1-placement-drpc       9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox2-placement-drpc       9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox22-placement-drpc      9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox26-placement-drpc      9m26s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
      openshift-gitops       rbd-appset-busybox3-placement-drpc       9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      openshift-gitops       rbd-appset-busybox4-placement-drpc       9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
      *****************************************

      Even after more than 5 minutes, CURRENTSTATE and PROGRESSION are still empty for most of the workloads. However, these workloads can still be failed over via the UI to the secondary managed cluster C2, so functionality is not impacted.
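
The "more than 5 minutes" figure can be checked against the two timestamps quoted in this report: the DRPolicy validation time and the last drpc snapshot. A minimal shell sketch of that arithmetic, assuming both timestamps fall on the same UTC day; `hms_to_secs` is a hypothetical helper name introduced here for illustration only:

```shell
#!/usr/bin/env bash
# Convert an HH:MM:SS timestamp to seconds since midnight.
# awk is used so that leading zeros (e.g. "08") are not parsed as octal.
hms_to_secs() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

validated="12:48:46"   # DRPolicy validated on the new hub (from the report)
last_seen="12:55:02"   # last drpc snapshot shown in the report

elapsed=$(( $(hms_to_secs "$last_seen") - $(hms_to_secs "$validated") ))
echo "elapsed: $(( elapsed / 60 ))m$(( elapsed % 60 ))s"   # prints "elapsed: 6m16s"
```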

      Expected results: Reduce the time taken to rebuild the CURRENTSTATE and PROGRESSION status of inaccessible workloads after hub recovery.
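
To measure the rebuild time rather than read it off repeated table dumps, the polling could be scripted. The sketch below is only an illustration under stated assumptions: it assumes the CURRENTSTATE and PROGRESSION columns are backed by `.status.phase` and `.status.progression` on the DRPC resource, and that `oc get drpc -A` is what the `drpc` shorthand in the transcript expands to. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Count DRPCs whose CURRENTSTATE or PROGRESSION is still empty.
# Input on stdin: tab-separated lines "namespace<TAB>name<TAB>phase<TAB>progression".
count_unpopulated() {
  awk -F'\t' 'NF >= 2 && ($3 == "" || $4 == "") { n++ } END { print n + 0 }'
}

# Emit one tab-separated line per DRPC.
# Assumption: the status fields are .status.phase and .status.progression.
list_drpc_states() {
  oc get drpc -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.status.progression}{"\n"}{end}'
}

# Poll until every DRPC reports both fields, logging the elapsed time.
# $1: polling interval in seconds (default 5, as in the transcript).
wait_for_drpc_states() {
  local interval="${1:-5}" start pending
  start=$(date +%s)
  while true; do
    pending=$(list_drpc_states | count_unpopulated)
    echo "$(date -u) pending=${pending} elapsed=$(( $(date +%s) - start ))s"
    [ "$pending" -eq 0 ] && break
    sleep "$interval"
  done
}
```

Running `wait_for_drpc_states 5` on the recovered hub right after the DRPolicy is validated would log one line per poll and stop once no DRPC has an empty state, which gives a single number to compare before and after a fix.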

      Additional info:

              bmekhiss Benamar Mekhissi
              amagrawa@redhat.com Aman Agrawal