  1. Data Foundation Bugs
  2. DFBUGS-331

[2308801] [RDR] [Hub recovery] [Neutral] When the backed-up state is not latest, deployment/pod is lost for the apps in relocated state


    • False
    • False
    • 4.17.1
    • ?
    • 4.17.0-103
    • ?
    • RamenDR sprint 2024 #16, RamenDR sprint 2024 #17
    • None

      Description of problem (please be as detailed as possible and provide log
      snippets):

      Version of all relevant components (if applicable):
      OCP 4.16.0-0.nightly-2024-08-29-060830
      ODF 4.16.1-8
      ACM 2.11.2 GA'ed
      OADP 1.4.0
      MCE 2.6.2
      RH Gitops 1.13.1
      Submariner 0.18.0
      VolSync 0.10.0
      ceph version 18.2.1-229.el9cp (ef652b206f2487adfc86613646a4cac946f6b4e0) reef (stable)

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?

      Is there any workaround available to the best of your knowledge?

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?

      Is this issue reproducible?

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. On an RDR setup with multiple workloads, rbd-appset(pull)/sub and cephfs-appset(pull)/sub, in Deployed, FailedOver and Relocated states running on one of the managed clusters, and one each of rbd-appset(pull)/sub and cephfs-appset(pull)/sub in Deployed state on the other managed cluster, configure hub recovery but do not start taking new backups.
      2. Before backups are taken, ensure the above state is achieved.
      3. Now start taking backups and, once there are 1 or 2 successful backups, either stop the backup or increase the backup interval so that no new backup is taken while the actions below are performed.
      Collect outputs and note down other observations.
      4. Now failover/relocate the workloads that are already in FailedOver or Relocated state to the other cluster.
      That is, move the workloads that are primary on C1 to C2, or the other way round. Let the workloads in Deployed state remain as they are on both managed clusters.
      5. Make sure that the latest state of the workloads and drpc is *NOT* backed up, as mentioned in Step 3 above. We do not want the latest state to be backed up.

      Collect outputs and note down other observations.

      After all the operations complete, let IOs run for some time and then
      perform hub recovery by bringing the active hub cluster down.

      6. After moving to the new hub, ensure the drpolicy is validated and the drpc is restored.
      7. Check the drpc status; it should match the last backed-up state of the drpc, i.e. the state captured in Step 3 before we stopped taking backups.
      8. The workloads will now try to move to a different managed cluster as per the restored drpc state. Apps in Relocated state will wait for user action (WaitForUser).
      9. Relocate all such apps via the ACM UI.

      Actual results:

      ================================================================================================================================================================
      DRPC state when backup was taken:

      At date -u
      Fri Aug 30 10:26:25 UTC 2024

      Backups on active hub

      backup
      NAMESPACE NAME AGE
      open-cluster-management-backup acm-credentials-schedule-20240830101055 14m
      open-cluster-management-backup acm-credentials-schedule-20240830101554 9m34s
      open-cluster-management-backup acm-credentials-schedule-20240830102054 4m34s
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 14m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 9m34s
      open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 4m34s
      open-cluster-management-backup acm-resources-generic-schedule-20240830101055 14m
      open-cluster-management-backup acm-resources-generic-schedule-20240830101554 9m34s
      open-cluster-management-backup acm-resources-generic-schedule-20240830102054 4m34s
      open-cluster-management-backup acm-resources-schedule-20240830101055 14m
      open-cluster-management-backup acm-resources-schedule-20240830101554 9m34s
      open-cluster-management-backup acm-resources-schedule-20240830102054 4m34s
      open-cluster-management-backup acm-validation-policy-schedule-20240830101554 9m34s
      open-cluster-management-backup acm-validation-policy-schedule-20240830102054 4m34s
      ////////////
      NAME PHASE MESSAGE
      schedule-acm Enabled Velero schedules are enabled
      ////////////
      NAME REMEDIATION ACTION COMPLIANCE STATE AGE
      backup-restore-enabled inform Compliant 2d12h
      ////////////
      NAME PHASE LAST VALIDATED AGE DEFAULT
      default Available 43s 2d12h true
      ////////////
      NAME STATUS SCHEDULE LASTBACKUP AGE PAUSED
      acm-credentials-schedule Enabled 0 */99 * * * 4m38s 14m
      acm-managed-clusters-schedule Enabled 0 */99 * * * 4m38s 14m
      acm-resources-generic-schedule Enabled 0 */99 * * * 4m38s 14m
      acm-resources-schedule Enabled 0 */99 * * * 4m38s 14m
      acm-validation-policy-schedule Enabled 0 */99 * * * 4m38s 14m

      Backups on passive hub

      backup
      NAMESPACE NAME AGE
      open-cluster-management-backup acm-credentials-schedule-20240830101055 15m
      open-cluster-management-backup acm-credentials-schedule-20240830101554 10m
      open-cluster-management-backup acm-credentials-schedule-20240830102054 5m6s
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 15m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 10m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 5m5s
      open-cluster-management-backup acm-resources-generic-schedule-20240830101055 13m
      open-cluster-management-backup acm-resources-generic-schedule-20240830101554 8m6s
      open-cluster-management-backup acm-resources-generic-schedule-20240830102054 3m6s
      open-cluster-management-backup acm-resources-schedule-20240830101055 13m
      open-cluster-management-backup acm-resources-schedule-20240830101554 8m5s
      open-cluster-management-backup acm-resources-schedule-20240830102054 3m5s
      open-cluster-management-backup acm-validation-policy-schedule-20240830101554 8m5s
      open-cluster-management-backup acm-validation-policy-schedule-20240830102054 3m5s
      ////////////
      No resources found in open-cluster-management-backup namespace.
      ////////////
      NAME REMEDIATION ACTION COMPLIANCE STATE AGE
      backup-restore-enabled inform Compliant 16h
      ////////////
      NAME PHASE LAST VALIDATED AGE DEFAULT
      default Available 10s 3h12m true
      ////////////
      No resources found in open-cluster-management-backup namespace.

      DRPC at active hub

      drpc
      ////////////////////////////////
      Fri Aug 30 10:26:49 UTC 2024
      *******
      NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
      busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 38h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:43:31Z 4m7.14536939s True
      busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 38h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z 16.037619869s True
      busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 38h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:36:52Z 22.031782646s True
      busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 38h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:45:04Z 2m39.777750435s True
      busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 38h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:45:13Z 3m27.543350385s True
      busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 38h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:49:28Z 37.092845577s True
      busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 38h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:50:18Z 43.102846191s True
      busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 38h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:43:17Z 4m40.034887668s True
      openshift-gitops cephfs-appset-busybox5-placement-drpc 42h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:44:45Z 2m48.484996609s True
      openshift-gitops cephfs-appset-busybox6-placement-drpc 42h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:44:50Z 5m20.161782009s True
      openshift-gitops cephfs-appset-busybox7-placement-drpc 42h amagrawa-c1-28aug Deployed Completed 2024-08-28T16:15:35Z 45.117386476s True
      openshift-gitops cephfs-appset-busybox8-placement-drpc 42h amagrawa-c2-28aug Deployed Completed 2024-08-28T16:16:28Z 48.103711373s True
      openshift-gitops rbd-appset-busybox1-placement-drpc 42h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:42:58Z 5m21.790526304s True
      openshift-gitops rbd-appset-busybox2-placement-drpc 42h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:43:05Z 6m34.107132984s True
      openshift-gitops rbd-appset-busybox3-placement-drpc 42h amagrawa-c1-28aug Deployed Completed 2024-08-28T16:11:39Z 5.045949912s True
      openshift-gitops rbd-appset-busybox4-placement-drpc 42h amagrawa-c2-28aug Deployed Completed 2024-08-28T16:12:36Z 1.042272661s True

      ================================================================================================================================================================
      DRPC state after backup was stopped:

      From active hub-

      drpc
      ////////////////////////////////
      Fri Aug 30 17:12:07 UTC 2024
      *******
      NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
      busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 45h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:33:03Z 4m31.136839895s True
      busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 45h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z 16.037619869s True
      busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 45h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:36:52Z 22.031782646s True
      busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 45h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:31:41Z 3m34.276720598s True
      busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 45h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:31:49Z 3m11.99710934s True
      busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 45h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:49:28Z 37.092845577s True
      busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 45h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:50:18Z 43.102846191s True
      busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 45h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:32:55Z 4m25.040487678s True
      openshift-gitops cephfs-appset-busybox5-placement-drpc 2d amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:32:11Z 3m9.894131846s True
      openshift-gitops cephfs-appset-busybox6-placement-drpc 2d amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:32:18Z 3m17.106363595s True
      openshift-gitops cephfs-appset-busybox7-placement-drpc 2d amagrawa-c1-28aug Deployed Completed 2024-08-28T16:15:35Z 45.117386476s True
      openshift-gitops cephfs-appset-busybox8-placement-drpc 2d amagrawa-c2-28aug Deployed Completed 2024-08-28T16:16:28Z 48.103711373s True
      openshift-gitops rbd-appset-busybox1-placement-drpc 2d1h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:33:24Z 5m26.300788158s True
      openshift-gitops rbd-appset-busybox2-placement-drpc 2d1h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:33:32Z 4m48.836673355s True
      openshift-gitops rbd-appset-busybox3-placement-drpc 2d1h amagrawa-c1-28aug Deployed Completed 2024-08-28T16:11:39Z 5.045949912s True
      openshift-gitops rbd-appset-busybox4-placement-drpc 2d1h amagrawa-c2-28aug Deployed Completed 2024-08-28T16:12:36Z 1.042272661s True
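
      To make the difference between the two DRPC snapshots easier to spot, here is a minimal Python sketch that parses a few rows copied from the tables above and lists the DRPCs whose placement or state changed after the last backup. The helper names (parse_drpc_row, changed_drpcs) are hypothetical, and the parser assumes the whitespace-separated layout shown in this report, where Deployed rows omit the FAILOVERCLUSTER and DESIREDSTATE columns.

```python
# CURRENTSTATE values that appear in the DRPC tables of this report.
CURRENT_STATES = {"Deployed", "FailedOver", "Relocated", "WaitForUser"}


def parse_drpc_row(line):
    """Return (name, (preferred, failover, desired, current)) for one row."""
    tokens = line.split()
    name, preferred = tokens[1], tokens[3]
    rest = tokens[4:]
    # Locate CURRENTSTATE; the columns before it are optional in this output.
    idx = next(i for i, tok in enumerate(rest) if tok in CURRENT_STATES)
    failover = rest[0] if idx == 2 else ""
    desired = rest[idx - 1] if idx >= 1 else ""
    return name, (preferred, failover, desired, rest[idx])


def changed_drpcs(before_rows, after_rows):
    """Names of DRPCs whose clusters or states differ between two snapshots."""
    before = dict(parse_drpc_row(r) for r in before_rows)
    after = dict(parse_drpc_row(r) for r in after_rows)
    return sorted(n for n in before if n in after and before[n] != after[n])


# Three rows from the snapshot taken while backups were still running.
BEFORE = [
    "busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 38h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:43:31Z 4m7.14536939s True",
    "busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 38h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z 16.037619869s True",
    "busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 38h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:45:13Z 3m27.543350385s True",
]
# The same three DRPCs after backups were stopped.
AFTER = [
    "busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 45h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:33:03Z 4m31.136839895s True",
    "busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 45h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z 16.037619869s True",
    "busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 45h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:31:49Z 3m11.99710934s True",
]

print(changed_drpcs(BEFORE, AFTER))
```

      For these sample rows the Deployed workload is unchanged, while the two workloads moved in Step 4 are reported as changed; that changed state is what the backups do not contain.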

      group
      ******************************
      Fri Aug 30 17:12:12 UTC 2024
      *******
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-10
      namespace: busybox-workloads-10
      namespace: busybox-workloads-10
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-10
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-11
      namespace: busybox-workloads-11
      namespace: busybox-workloads-11
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-11
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-12
      namespace: busybox-workloads-12
      namespace: busybox-workloads-12
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-12
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-13
      namespace: busybox-workloads-13
      namespace: busybox-workloads-13
      lastGroupSyncTime: "2024-08-30T17:10:45Z"
      namespace: busybox-workloads-13
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-14
      namespace: busybox-workloads-14
      namespace: busybox-workloads-14
      lastGroupSyncTime: "2024-08-30T17:10:45Z"
      namespace: busybox-workloads-14
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-15
      namespace: busybox-workloads-15
      namespace: busybox-workloads-15
      lastGroupSyncTime: "2024-08-30T17:10:57Z"
      namespace: busybox-workloads-15
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-16
      namespace: busybox-workloads-16
      namespace: busybox-workloads-16
      lastGroupSyncTime: "2024-08-30T17:11:05Z"
      namespace: busybox-workloads-16
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-9
      namespace: busybox-workloads-9
      namespace: busybox-workloads-9
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-9
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-5
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:10:46Z"
      namespace: busybox-workloads-5
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-6
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:10:48Z"
      namespace: busybox-workloads-6
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-7
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:11:02Z"
      namespace: busybox-workloads-7
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-8
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:11:13Z"
      namespace: busybox-workloads-8
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-1
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-1
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-2
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-2
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-3
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-3
      drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-4
      namespace: openshift-gitops
      namespace: openshift-gitops
      lastGroupSyncTime: "2024-08-30T17:10:00Z"
      namespace: busybox-workloads-4

      date -u
      Fri Aug 30 17:12:15 UTC 2024

      backup
      NAMESPACE NAME AGE
      open-cluster-management-backup acm-credentials-schedule-20240830101055 7h1m
      open-cluster-management-backup acm-credentials-schedule-20240830101554 6h56m
      open-cluster-management-backup acm-credentials-schedule-20240830102054 6h51m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 7h1m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 6h56m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 6h51m
      open-cluster-management-backup acm-resources-generic-schedule-20240830101055 7h1m
      open-cluster-management-backup acm-resources-generic-schedule-20240830101554 6h56m
      open-cluster-management-backup acm-resources-generic-schedule-20240830102054 6h51m
      open-cluster-management-backup acm-resources-schedule-20240830101055 7h1m
      open-cluster-management-backup acm-resources-schedule-20240830101554 6h56m
      open-cluster-management-backup acm-resources-schedule-20240830102054 6h51m
      ////////////
      NAME PHASE MESSAGE
      schedule-acm Enabled Velero schedules are enabled
      ////////////
      NAME REMEDIATION ACTION COMPLIANCE STATE AGE
      backup-restore-enabled inform NonCompliant 2d19h
      ////////////
      NAME PHASE LAST VALIDATED AGE DEFAULT
      default Available 37s 2d19h true
      ////////////
      NAME STATUS SCHEDULE LASTBACKUP AGE PAUSED
      acm-credentials-schedule Enabled 0 */99 * * * 6h51m 7h1m
      acm-managed-clusters-schedule Enabled 0 */99 * * * 6h51m 7h1m
      acm-resources-generic-schedule Enabled 0 */99 * * * 6h51m 7h1m
      acm-resources-schedule Enabled 0 */99 * * * 6h51m 7h1m
      acm-validation-policy-schedule Enabled 0 */99 * * * 6h51m 7h1m
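
      The gap between the newest backup and the later DR actions can be checked from timestamps already present in this report. Below is a minimal sketch, assuming the numeric suffix of each backup name is a UTC timestamp in YYYYMMDDHHMMSS form (it lines up with the AGE column and the date -u output here, but that is an inference and not something this report states). The function names are hypothetical.

```python
from datetime import datetime, timezone


def backup_time(backup_name):
    """Parse the trailing YYYYMMDDHHMMSS suffix of a backup name (assumed UTC)."""
    stamp = backup_name.rsplit("-", 1)[1]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def drpc_start_time(value):
    """Parse a DRPC START TIME value such as 2024-08-30T10:31:41Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# Backup names copied from the backup listing in this report.
backups = [
    "acm-resources-schedule-20240830101055",
    "acm-resources-schedule-20240830101554",
    "acm-resources-schedule-20240830102054",
]
# Earliest and latest START TIME among the DRPCs moved after backups stopped.
action_starts = ["2024-08-30T10:31:41Z", "2024-08-30T10:33:32Z"]

latest_backup = max(backup_time(b) for b in backups)
earliest_action = min(drpc_start_time(s) for s in action_starts)
gap = earliest_action - latest_backup

print("latest backup:", latest_backup.isoformat())
print("earliest later action:", earliest_action.isoformat())
print("seconds between them:", int(gap.total_seconds()))
```

      With these values the earliest failover/relocate starts 10m47s after the newest backup, so none of those actions can be part of the backed-up state.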

      From passive hub-

      backup
      NAMESPACE NAME AGE
      open-cluster-management-backup acm-credentials-schedule-20240830101055 7h1m
      open-cluster-management-backup acm-credentials-schedule-20240830101554 6h56m
      open-cluster-management-backup acm-credentials-schedule-20240830102054 6h51m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 7h1m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 6h56m
      open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 6h51m
      open-cluster-management-backup acm-resources-generic-schedule-20240830101055 6h59m
      open-cluster-management-backup acm-resources-generic-schedule-20240830101554 6h54m
      open-cluster-management-backup acm-resources-generic-schedule-20240830102054 6h49m
      open-cluster-management-backup acm-resources-schedule-20240830101055 6h59m
      open-cluster-management-backup acm-resources-schedule-20240830101554 6h54m
      open-cluster-management-backup acm-resources-schedule-20240830102054 6h49m
      ////////////
      No resources found in open-cluster-management-backup namespace.
      ////////////
      NAME REMEDIATION ACTION COMPLIANCE STATE AGE
      backup-restore-enabled inform NonCompliant 23h
      ////////////
      NAME PHASE LAST VALIDATED AGE DEFAULT
      default Available 22s 9h true
      ////////////
      No resources found in open-cluster-management-backup namespace.

      ================================================================================
      Active hub was brought down around Fri Aug 30 17:14:12 UTC 2024
      ================================================================================

      ================================================================================
      Restored backups on passive hub at around date -u
      Fri Aug 30 17:23:02 UTC 2024
      ================================================================================

      ================================================================================
      DRpolicy got validated at about date -u
      Fri Aug 30 17:24:28 UTC 2024
      ================================================================================

      DRPC state after hub recovery:

      From new active hub-

      drpc
      ////////////////////////////////
      Fri Aug 30 21:51:31 UTC 2024
      *******
      NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
      busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
      busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 898.25644ms True
      busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 997.389937ms True
      busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:59Z 2m37.256839681s True
      busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
      busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 1.797831304s True
      busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.297596905s True
      busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:37Z 1h4m2.552357112s True
      openshift-gitops cephfs-appset-busybox5-placement-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:25:00Z 2m36.525322643s True
      openshift-gitops cephfs-appset-busybox6-placement-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
      openshift-gitops cephfs-appset-busybox7-placement-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 996.947009ms True
      openshift-gitops cephfs-appset-busybox8-placement-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:25:02Z 597.880944ms True
      openshift-gitops rbd-appset-busybox1-placement-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:36Z 1h3m32.768428942s True
      openshift-gitops rbd-appset-busybox2-placement-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
      openshift-gitops rbd-appset-busybox3-placement-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 297.132956ms True
      openshift-gitops rbd-appset-busybox4-placement-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.498228814s True

      ================================================================================

      Then I relocated the apps below via the ACM UI around Fri Aug 30 21:53:30 UTC 2024

      (This was after the evictiontimeout period of 1hr had passed)

      rbd-sub-busybox10-placement-1-drpc
      cephfs-sub-busybox14-placement-1-drpc
      cephfs-appset-busybox6-placement-drpc
      rbd-appset-busybox2-placement-drpc

      DRPC state after relocate-

      drpc
      ////////////////////////////////
      Fri Aug 30 22:14:38 UTC 2024
      *******
      NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
      busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed True
      busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 898.25644ms True
      busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 997.389937ms True
      busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:59Z 2m37.256839681s True
      busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate WaitForUser Paused True
      busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 1.797831304s True
      busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.297596905s True
      busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:37Z 1h4m2.552357112s True
      openshift-gitops cephfs-appset-busybox5-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:25:00Z 2m36.525322643s True
      openshift-gitops cephfs-appset-busybox6-placement-drpc 4h53m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
      openshift-gitops cephfs-appset-busybox7-placement-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 996.947009ms True
      openshift-gitops cephfs-appset-busybox8-placement-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:25:02Z 597.880944ms True
      openshift-gitops rbd-appset-busybox1-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:36Z 1h3m32.768428942s True
      openshift-gitops rbd-appset-busybox2-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate WaitForUser Paused True
      openshift-gitops rbd-appset-busybox3-placement-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 297.132956ms True
      openshift-gitops rbd-appset-busybox4-placement-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.498228814s True
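
      As a rough cross-check of the table above, the following sketch takes the four DRPCs that were relocated via the ACM UI and reports which of them still show WaitForUser. The rows are copied from this report; the function name still_waiting is hypothetical and the check is a plain token match on the pasted output, not a query against the cluster.

```python
def still_waiting(relocated_names, drpc_rows):
    """Return the relocated DRPC names whose row still contains WaitForUser."""
    waiting = set()
    for row in drpc_rows:
        tokens = row.split()
        if "WaitForUser" in tokens:
            waiting.add(tokens[1])  # the second column is the DRPC name
    return [name for name in relocated_names if name in waiting]


# DRPCs relocated via the ACM UI, in the order listed in this report.
relocated = [
    "rbd-sub-busybox10-placement-1-drpc",
    "cephfs-sub-busybox14-placement-1-drpc",
    "cephfs-appset-busybox6-placement-drpc",
    "rbd-appset-busybox2-placement-drpc",
]
# Their rows from the "DRPC state after relocate" table.
rows = [
    "busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed True",
    "busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate WaitForUser Paused True",
    "openshift-gitops cephfs-appset-busybox6-placement-drpc 4h53m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True",
    "openshift-gitops rbd-appset-busybox2-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate WaitForUser Paused True",
]

for name in still_waiting(relocated, rows):
    print(name)
```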

      ================================================================================================================================================================

      There was no change in the drpc state for workloads
      cephfs-sub-busybox14-placement-1-drpc
      cephfs-appset-busybox6-placement-drpc
      rbd-appset-busybox2-placement-drpc

      For rbd-sub-busybox10-placement-1-drpc, Relocate was marked as Completed but the workload still had issues.

      NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
      busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed True

      C1-

      busybox-10
      Already on project "busybox-workloads-10" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
      NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
      persistentvolumeclaim/busybox-pvc-41 Terminating pvc-6c950553-71ad-4eb6-ac17-7ada3c5a7c2c 42Gi RWO ocs-storagecluster-ceph-rbd <unset> 22h Filesystem

      NAME AGE VOLUMEREPLICATIONCLASS PVCNAME DESIREDSTATE CURRENTSTATE
      volumereplication.replication.storage.openshift.io/busybox-pvc-41 22h rbd-volumereplicationclass-1625360775 busybox-pvc-41 primary Primary

      NAME DESIREDSTATE CURRENTSTATE
      volumereplicationgroup.ramendr.openshift.io/rbd-sub-busybox10-placement-1-drpc primary Primary

      NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
      pod/busybox-41-5c55b45d49-6vb7h 0/1 Pending 0 11h <none> <none> <none> <none>

      Here the PVC is in Terminating state and the pod is stuck in Pending

      oc get deploy
      NAME READY UP-TO-DATE AVAILABLE AGE
      busybox-41 0/1 1 0 11h

      From C2- NA

      Other outputs:

      C1-

      busybox-14
      Now using project "busybox-workloads-14" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
      NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
      persistentvolumeclaim/busybox-pvc-1 Bound pvc-3bd6202c-5aba-478b-86f4-571d9adb9334 94Gi RWX ocs-storagecluster-cephfs <unset> 2d13h Filesystem
      persistentvolumeclaim/volsync-busybox-pvc-1-src Bound pvc-b79186e4-a159-4896-8571-964fd041e5cc 94Gi ROX ocs-storagecluster-cephfs-vrg <unset> 28s Filesystem

      NAME DESIREDSTATE CURRENTSTATE
      volumereplicationgroup.ramendr.openshift.io/cephfs-sub-busybox14-placement-1-drpc primary Primary

      NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
      pod/volsync-rsync-tls-src-busybox-pvc-1-cttmh 1/1 Running 0 29s 10.129.3.110 compute-0 <none> <none>

      Here deployment/pod is lost

      oc get deploy
      No resources found in busybox-workloads-14 namespace.

      C2-

      busybox-14
      Now using project "busybox-workloads-14" on server "https://api.amagrawa-c2-28aug.qe.rh-ocs.com:6443".
      NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
      persistentvolumeclaim/busybox-pvc-1 Bound pvc-9e4f5561-4deb-43e2-9c13-eb98895fe550 94Gi RWX ocs-storagecluster-cephfs <unset> 2d13h Filesystem

      NAME DESIREDSTATE CURRENTSTATE
      volumereplicationgroup.ramendr.openshift.io/cephfs-sub-busybox14-placement-1-drpc secondary Secondary

      NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
      pod/volsync-rsync-tls-dst-busybox-pvc-1-bnbd7 1/1 Running 0 4m59s 10.128.2.204 compute-1 <none> <none>

      C1-

      busybox-6
      Now using project "busybox-workloads-6" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
      NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
      persistentvolumeclaim/busybox-pvc-1 Bound pvc-0e84e495-fe6e-45b1-8c27-9f4c92055b29 94Gi RWX ocs-storagecluster-cephfs <unset> 2d17h Filesystem

      NAME DESIREDSTATE CURRENTSTATE
      volumereplicationgroup.ramendr.openshift.io/cephfs-appset-busybox6-placement-drpc primary Primary

      Here deployment/pod is lost

      oc get deploy
      No resources found in busybox-workloads-6 namespace.

      C2-

      busybox-6
      Now using project "busybox-workloads-6" on server "https://api.amagrawa-c2-28aug.qe.rh-ocs.com:6443".
      NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
      persistentvolumeclaim/busybox-pvc-1 Bound pvc-867dedb7-d424-43ab-bf0e-ec213fa6b295 94Gi RWX ocs-storagecluster-cephfs <unset> 2d17h Filesystem

      NAME DESIREDSTATE CURRENTSTATE
      volumereplicationgroup.ramendr.openshift.io/cephfs-appset-busybox6-placement-drpc secondary Secondary

      NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
      pod/volsync-rsync-tls-dst-busybox-pvc-1-bfx9r 1/1 Running 0 56s 10.128.2.213 compute-1 <none> <none>

      C1-

      busybox-2
      Now using project "busybox-workloads-2" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
      NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
      persistentvolumeclaim/busybox-pvc-41 Terminating pvc-fe3f096a-323f-4592-93a8-376249d948f2 42Gi RWO ocs-storagecluster-ceph-rbd <unset> 22h Filesystem

      NAME AGE VOLUMEREPLICATIONCLASS PVCNAME DESIREDSTATE CURRENTSTATE
      volumereplication.replication.storage.openshift.io/busybox-pvc-41 22h rbd-volumereplicationclass-1625360775 busybox-pvc-41 primary Primary

      NAME DESIREDSTATE CURRENTSTATE
      volumereplicationgroup.ramendr.openshift.io/rbd-appset-busybox2-placement-drpc primary Primary

      Here PVC is in terminating state and deployment/pod is lost

      oc get deploy
      No resources found in busybox-workloads-2 namespace.

      C2- NA

      Expected results: Deployment/pod should not be lost, and the apps should be relocated successfully when the relocate operation is triggered post hub recovery for the above-mentioned apps.

      Additional info:

      Logs collected after triggering relocate operation post hub recovery-

      http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-aman/31aug24/

              bmekhiss Benamar Mekhissi
              amagrawa@redhat.com Aman Agrawal