Bug
Resolution: Unresolved
Critical
odf-4.16
False
False
4.17.1
?
4.17.0-103
?
RamenDR sprint 2024 #16, RamenDR sprint 2024 #17
None
Description of problem (please be as detailed as possible and provide log
snippets):
Version of all relevant components (if applicable):
OCP 4.16.0-0.nightly-2024-08-29-060830
ODF 4.16.1-8
ACM 2.11.2 GA'ed
OADP 1.4.0
MCE 2.6.2
RH Gitops 1.13.1
Submariner 0.18.0
VolSync 0.10.0
ceph version 18.2.1-229.el9cp (ef652b206f2487adfc86613646a4cac946f6b4e0) reef (stable)
Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Is there any workaround available to the best of your knowledge?
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
Is this issue reproducible?
Can this issue be reproduced from the UI?
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. On an RDR setup with multiple workloads, have rbd-appset(pull)/sub and cephfs-appset(pull)/sub workloads in Deployed, FailedOver and Relocated states running on one of the managed clusters, and one each of rbd-appset(pull)/sub and cephfs-appset(pull)/sub in Deployed state on the other managed cluster. Then configure the setup for hub recovery, but do not start taking new backups yet.
2. Ensure the above state is achieved before any backups are taken.
3. Start taking backups. Once 1 or 2 backups have succeeded, either stop the backup schedule or increase the backup interval so that no new backup is taken while the following actions are performed.
Collect outputs and note down other observations.
4. Failover/relocate the workloads that are already in FailedOver or Relocated state to the other cluster.
That is, move the workloads that are primary on C1 to C2 and vice versa. Leave the workloads in Deployed state as they are on both managed clusters.
5. Make sure that the latest state of the workloads and DRPCs is *NOT* backed up, as mentioned in Step 3 above. We do not want new backups to be taken at this point.
Collect outputs and note down other observations.
After all the operations complete, let IOs run for some time and then
perform hub recovery by bringing the active hub cluster down.
6. After moving to the new hub, ensure the DRPolicy is validated and the DRPCs are restored.
7. Check the DRPC status; it should match the last backed-up DRPC state from Step 3 above, before we stopped taking backups.
8. The workloads will now try to move to a different managed cluster as per the restored DRPC state. Apps in Relocated state will wait for user action (WaitForUser).
9. Relocate all such apps via the ACM UI.
Actual results:
================================================================================================================================================================
DRPC state when backup was taken:
At date -u
Fri Aug 30 10:26:25 UTC 2024
Backups on active hub
backup
NAMESPACE NAME AGE
open-cluster-management-backup acm-credentials-schedule-20240830101055 14m
open-cluster-management-backup acm-credentials-schedule-20240830101554 9m34s
open-cluster-management-backup acm-credentials-schedule-20240830102054 4m34s
open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 14m
open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 9m34s
open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 4m34s
open-cluster-management-backup acm-resources-generic-schedule-20240830101055 14m
open-cluster-management-backup acm-resources-generic-schedule-20240830101554 9m34s
open-cluster-management-backup acm-resources-generic-schedule-20240830102054 4m34s
open-cluster-management-backup acm-resources-schedule-20240830101055 14m
open-cluster-management-backup acm-resources-schedule-20240830101554 9m34s
open-cluster-management-backup acm-resources-schedule-20240830102054 4m34s
open-cluster-management-backup acm-validation-policy-schedule-20240830101554 9m34s
open-cluster-management-backup acm-validation-policy-schedule-20240830102054 4m34s
////////////
NAME PHASE MESSAGE
schedule-acm Enabled Velero schedules are enabled
////////////
NAME REMEDIATION ACTION COMPLIANCE STATE AGE
backup-restore-enabled inform Compliant 2d12h
////////////
NAME PHASE LAST VALIDATED AGE DEFAULT
default Available 43s 2d12h true
////////////
NAME STATUS SCHEDULE LASTBACKUP AGE PAUSED
acm-credentials-schedule Enabled 0 */99 * * * 4m38s 14m
acm-managed-clusters-schedule Enabled 0 */99 * * * 4m38s 14m
acm-resources-generic-schedule Enabled 0 */99 * * * 4m38s 14m
acm-resources-schedule Enabled 0 */99 * * * 4m38s 14m
acm-validation-policy-schedule Enabled 0 */99 * * * 4m38s 14m
Backups on passive hub
backup
NAMESPACE NAME AGE
open-cluster-management-backup acm-credentials-schedule-20240830101055 15m
open-cluster-management-backup acm-credentials-schedule-20240830101554 10m
open-cluster-management-backup acm-credentials-schedule-20240830102054 5m6s
open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 15m
open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 10m
open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 5m5s
open-cluster-management-backup acm-resources-generic-schedule-20240830101055 13m
open-cluster-management-backup acm-resources-generic-schedule-20240830101554 8m6s
open-cluster-management-backup acm-resources-generic-schedule-20240830102054 3m6s
open-cluster-management-backup acm-resources-schedule-20240830101055 13m
open-cluster-management-backup acm-resources-schedule-20240830101554 8m5s
open-cluster-management-backup acm-resources-schedule-20240830102054 3m5s
open-cluster-management-backup acm-validation-policy-schedule-20240830101554 8m5s
open-cluster-management-backup acm-validation-policy-schedule-20240830102054 3m5s
////////////
No resources found in open-cluster-management-backup namespace.
////////////
NAME REMEDIATION ACTION COMPLIANCE STATE AGE
backup-restore-enabled inform Compliant 16h
////////////
NAME PHASE LAST VALIDATED AGE DEFAULT
default Available 10s 3h12m true
////////////
No resources found in open-cluster-management-backup namespace.
DRPC at active hub
drpc
////////////////////////////////
Fri Aug 30 10:26:49 UTC 2024
*******
NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 38h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:43:31Z 4m7.14536939s True
busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 38h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z 16.037619869s True
busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 38h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:36:52Z 22.031782646s True
busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 38h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:45:04Z 2m39.777750435s True
busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 38h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:45:13Z 3m27.543350385s True
busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 38h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:49:28Z 37.092845577s True
busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 38h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:50:18Z 43.102846191s True
busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 38h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:43:17Z 4m40.034887668s True
openshift-gitops cephfs-appset-busybox5-placement-drpc 42h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:44:45Z 2m48.484996609s True
openshift-gitops cephfs-appset-busybox6-placement-drpc 42h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:44:50Z 5m20.161782009s True
openshift-gitops cephfs-appset-busybox7-placement-drpc 42h amagrawa-c1-28aug Deployed Completed 2024-08-28T16:15:35Z 45.117386476s True
openshift-gitops cephfs-appset-busybox8-placement-drpc 42h amagrawa-c2-28aug Deployed Completed 2024-08-28T16:16:28Z 48.103711373s True
openshift-gitops rbd-appset-busybox1-placement-drpc 42h amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-29T11:42:58Z 5m21.790526304s True
openshift-gitops rbd-appset-busybox2-placement-drpc 42h amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed 2024-08-29T11:43:05Z 6m34.107132984s True
openshift-gitops rbd-appset-busybox3-placement-drpc 42h amagrawa-c1-28aug Deployed Completed 2024-08-28T16:11:39Z 5.045949912s True
openshift-gitops rbd-appset-busybox4-placement-drpc 42h amagrawa-c2-28aug Deployed Completed 2024-08-28T16:12:36Z 1.042272661s True
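The DRPC listings in this report are pasted with single spaces, and rows in Deployed state leave the FAILOVERCLUSTER and DESIREDSTATE columns blank, so the columns do not line up. A minimal Python sketch of a row parser follows; it is a heuristic tuned to the rows shown here, not part of Ramen or ACM, and the names `DrpcRow` and `parse_drpc_row` are hypothetical.

```python
from typing import NamedTuple, Optional

# DESIREDSTATE values that appear in the listings of this report.
ACTIONS = {"Failover", "Relocate"}


class DrpcRow(NamedTuple):
    namespace: str
    name: str
    preferred: str
    failover: Optional[str]
    desired: Optional[str]
    current: str
    progression: str


def parse_drpc_row(line: str) -> DrpcRow:
    """Parse one flattened DRPC row (heuristic, based on this report only)."""
    tok = line.split()
    namespace, name, _age, preferred = tok[:4]
    rest = tok[4:]
    failover = desired = None
    # Rows that were failed over or relocated carry FAILOVERCLUSTER and
    # DESIREDSTATE; rows that were only deployed leave both columns blank.
    if len(rest) >= 2 and rest[1] in ACTIONS:
        failover, desired = rest[0], rest[1]
        rest = rest[2:]
    current, progression = rest[0], rest[1]
    return DrpcRow(namespace, name, preferred, failover, desired,
                   current, progression)


rows = [
    "busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 38h "
    "amagrawa-c2-28aug amagrawa-c1-28aug Relocate Relocated Completed "
    "2024-08-29T11:43:31Z 4m7.14536939s True",
    "busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 38h "
    "amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z "
    "16.037619869s True",
]
for raw in rows:
    print(parse_drpc_row(raw))
```

The same function also handles the WaitForUser rows later in the report, where START TIME and DURATION are empty, because it only reads the columns up to PROGRESSION.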
================================================================================================================================================================
DRPC state after backup was stopped:
From active hub-
drpc
////////////////////////////////
Fri Aug 30 17:12:07 UTC 2024
*******
NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 45h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:33:03Z 4m31.136839895s True
busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 45h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:35:56Z 16.037619869s True
busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 45h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:36:52Z 22.031782646s True
busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 45h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:31:41Z 3m34.276720598s True
busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 45h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:31:49Z 3m11.99710934s True
busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 45h amagrawa-c1-28aug Deployed Completed 2024-08-28T19:49:28Z 37.092845577s True
busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 45h amagrawa-c2-28aug Deployed Completed 2024-08-28T19:50:18Z 43.102846191s True
busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 45h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:32:55Z 4m25.040487678s True
openshift-gitops cephfs-appset-busybox5-placement-drpc 2d amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:32:11Z 3m9.894131846s True
openshift-gitops cephfs-appset-busybox6-placement-drpc 2d amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:32:18Z 3m17.106363595s True
openshift-gitops cephfs-appset-busybox7-placement-drpc 2d amagrawa-c1-28aug Deployed Completed 2024-08-28T16:15:35Z 45.117386476s True
openshift-gitops cephfs-appset-busybox8-placement-drpc 2d amagrawa-c2-28aug Deployed Completed 2024-08-28T16:16:28Z 48.103711373s True
openshift-gitops rbd-appset-busybox1-placement-drpc 2d1h amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed 2024-08-30T10:33:24Z 5m26.300788158s True
openshift-gitops rbd-appset-busybox2-placement-drpc 2d1h amagrawa-c2-28aug amagrawa-c1-28aug Failover FailedOver Completed 2024-08-30T10:33:32Z 4m48.836673355s True
openshift-gitops rbd-appset-busybox3-placement-drpc 2d1h amagrawa-c1-28aug Deployed Completed 2024-08-28T16:11:39Z 5.045949912s True
openshift-gitops rbd-appset-busybox4-placement-drpc 2d1h amagrawa-c2-28aug Deployed Completed 2024-08-28T16:12:36Z 1.042272661s True
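To make explicit which DRPCs are stale in the last backup, the two listings (10:26:49 UTC, when the backup was taken, and 17:12:07 UTC, before the hub went down) can be compared. The sketch below is illustrative only: the values are transcribed by hand for four of the workloads, and the rule in `active_cluster` (a FailedOver workload runs on FAILOVERCLUSTER, otherwise on PREFERREDCLUSTER) is an assumption about how to read the columns, not something stated in this report.

```python
C1, C2 = "amagrawa-c1-28aug", "amagrawa-c2-28aug"

# (PREFERREDCLUSTER, FAILOVERCLUSTER, CURRENTSTATE), transcribed from the
# two listings above for a subset of the workloads.
at_backup = {
    "rbd-sub-busybox10": (C2, C1, "Relocated"),
    "cephfs-sub-busybox13": (C1, C2, "FailedOver"),
    "rbd-appset-busybox2": (C2, C1, "Relocated"),
    "rbd-sub-busybox11": (C1, None, "Deployed"),
}
before_hub_down = {
    "rbd-sub-busybox10": (C1, C2, "Relocated"),
    "cephfs-sub-busybox13": (C1, C2, "Relocated"),
    "rbd-appset-busybox2": (C2, C1, "FailedOver"),
    "rbd-sub-busybox11": (C1, None, "Deployed"),
}


def active_cluster(preferred, failover, current):
    # Assumption: FailedOver means the workload runs on FAILOVERCLUSTER;
    # Deployed and Relocated mean it runs on PREFERREDCLUSTER.
    return failover if current == "FailedOver" else preferred


def stale_in_backup(old, new):
    """Names whose active cluster differs between the two snapshots."""
    return sorted(
        name for name in old
        if active_cluster(*old[name]) != active_cluster(*new[name])
    )


print(stale_in_backup(at_backup, before_hub_down))
```

Under that assumption, the workloads moved in Step 4 are reported as changed, while the Deployed workload is not.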
group
******************************
Fri Aug 30 17:12:12 UTC 2024
*******
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-10
namespace: busybox-workloads-10
namespace: busybox-workloads-10
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-10
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-11
namespace: busybox-workloads-11
namespace: busybox-workloads-11
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-11
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-12
namespace: busybox-workloads-12
namespace: busybox-workloads-12
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-12
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-13
namespace: busybox-workloads-13
namespace: busybox-workloads-13
lastGroupSyncTime: "2024-08-30T17:10:45Z"
namespace: busybox-workloads-13
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-14
namespace: busybox-workloads-14
namespace: busybox-workloads-14
lastGroupSyncTime: "2024-08-30T17:10:45Z"
namespace: busybox-workloads-14
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-15
namespace: busybox-workloads-15
namespace: busybox-workloads-15
lastGroupSyncTime: "2024-08-30T17:10:57Z"
namespace: busybox-workloads-15
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-16
namespace: busybox-workloads-16
namespace: busybox-workloads-16
lastGroupSyncTime: "2024-08-30T17:11:05Z"
namespace: busybox-workloads-16
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-9
namespace: busybox-workloads-9
namespace: busybox-workloads-9
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-9
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-5
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:10:46Z"
namespace: busybox-workloads-5
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-6
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:10:48Z"
namespace: busybox-workloads-6
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-7
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:11:02Z"
namespace: busybox-workloads-7
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-8
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:11:13Z"
namespace: busybox-workloads-8
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-1
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-1
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-2
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-2
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-3
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-3
drplacementcontrol.ramendr.openshift.io/app-namespace: busybox-workloads-4
namespace: openshift-gitops
namespace: openshift-gitops
lastGroupSyncTime: "2024-08-30T17:10:00Z"
namespace: busybox-workloads-4
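As a quick check of how fresh replication was just before the hub went down, the age of each lastGroupSyncTime can be computed against the capture time of this listing (17:12:12 UTC). This is a minimal sketch with a few values transcribed from the output above; the names `parse_utc` and `lag_seconds` are hypothetical.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp such as 2024-08-30T17:10:00Z as UTC."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc)


# Capture time printed just above the group listing.
captured_at = datetime(2024, 8, 30, 17, 12, 12, tzinfo=timezone.utc)

# A subset of the lastGroupSyncTime values from the listing.
last_group_sync = {
    "busybox-workloads-10": "2024-08-30T17:10:00Z",
    "busybox-workloads-13": "2024-08-30T17:10:45Z",
    "busybox-workloads-16": "2024-08-30T17:11:05Z",
    "busybox-workloads-8": "2024-08-30T17:11:13Z",
}

lag_seconds = {
    ns: int((captured_at - parse_utc(ts)).total_seconds())
    for ns, ts in last_group_sync.items()
}

for ns, lag in sorted(lag_seconds.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ns}: last group sync {lag}s before capture")
```

The oldest value in the listing is 17:10:00Z, so every DRPC shown had a group sync a little over two minutes or less before the capture.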
date -u
Fri Aug 30 17:12:15 UTC 2024
backup
NAMESPACE NAME AGE
open-cluster-management-backup acm-credentials-schedule-20240830101055 7h1m
open-cluster-management-backup acm-credentials-schedule-20240830101554 6h56m
open-cluster-management-backup acm-credentials-schedule-20240830102054 6h51m
open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 7h1m
open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 6h56m
open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 6h51m
open-cluster-management-backup acm-resources-generic-schedule-20240830101055 7h1m
open-cluster-management-backup acm-resources-generic-schedule-20240830101554 6h56m
open-cluster-management-backup acm-resources-generic-schedule-20240830102054 6h51m
open-cluster-management-backup acm-resources-schedule-20240830101055 7h1m
open-cluster-management-backup acm-resources-schedule-20240830101554 6h56m
open-cluster-management-backup acm-resources-schedule-20240830102054 6h51m
////////////
NAME PHASE MESSAGE
schedule-acm Enabled Velero schedules are enabled
////////////
NAME REMEDIATION ACTION COMPLIANCE STATE AGE
backup-restore-enabled inform NonCompliant 2d19h
////////////
NAME PHASE LAST VALIDATED AGE DEFAULT
default Available 37s 2d19h true
////////////
NAME STATUS SCHEDULE LASTBACKUP AGE PAUSED
acm-credentials-schedule Enabled 0 */99 * * * 6h51m 7h1m
acm-managed-clusters-schedule Enabled 0 */99 * * * 6h51m 7h1m
acm-resources-generic-schedule Enabled 0 */99 * * * 6h51m 7h1m
acm-resources-schedule Enabled 0 */99 * * * 6h51m 7h1m
acm-validation-policy-schedule Enabled 0 */99 * * * 6h51m 7h1m
From passive hub-
backup
NAMESPACE NAME AGE
open-cluster-management-backup acm-credentials-schedule-20240830101055 7h1m
open-cluster-management-backup acm-credentials-schedule-20240830101554 6h56m
open-cluster-management-backup acm-credentials-schedule-20240830102054 6h51m
open-cluster-management-backup acm-managed-clusters-schedule-20240830101055 7h1m
open-cluster-management-backup acm-managed-clusters-schedule-20240830101554 6h56m
open-cluster-management-backup acm-managed-clusters-schedule-20240830102054 6h51m
open-cluster-management-backup acm-resources-generic-schedule-20240830101055 6h59m
open-cluster-management-backup acm-resources-generic-schedule-20240830101554 6h54m
open-cluster-management-backup acm-resources-generic-schedule-20240830102054 6h49m
open-cluster-management-backup acm-resources-schedule-20240830101055 6h59m
open-cluster-management-backup acm-resources-schedule-20240830101554 6h54m
open-cluster-management-backup acm-resources-schedule-20240830102054 6h49m
////////////
No resources found in open-cluster-management-backup namespace.
////////////
NAME REMEDIATION ACTION COMPLIANCE STATE AGE
backup-restore-enabled inform NonCompliant 23h
////////////
NAME PHASE LAST VALIDATED AGE DEFAULT
default Available 22s 9h true
////////////
No resources found in open-cluster-management-backup namespace.
================================================================================
Active hub was brought down around Fri Aug 30 17:14:12 UTC 2024
================================================================================
================================================================================
Backups were restored on the passive hub at around (date -u):
Fri Aug 30 17:23:02 UTC 2024
================================================================================
================================================================================
DRPolicy was validated at about (date -u):
Fri Aug 30 17:24:28 UTC 2024
================================================================================
DRPC state after hub recovery:
From new active hub-
drpc
////////////////////////////////
Fri Aug 30 21:51:31 UTC 2024
*******
NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 898.25644ms True
busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 997.389937ms True
busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:59Z 2m37.256839681s True
busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 1.797831304s True
busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.297596905s True
busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:37Z 1h4m2.552357112s True
openshift-gitops cephfs-appset-busybox5-placement-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:25:00Z 2m36.525322643s True
openshift-gitops cephfs-appset-busybox6-placement-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
openshift-gitops cephfs-appset-busybox7-placement-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 996.947009ms True
openshift-gitops cephfs-appset-busybox8-placement-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:25:02Z 597.880944ms True
openshift-gitops rbd-appset-busybox1-placement-drpc 4h30m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:36Z 1h3m32.768428942s True
openshift-gitops rbd-appset-busybox2-placement-drpc 4h30m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
openshift-gitops rbd-appset-busybox3-placement-drpc 4h30m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 297.132956ms True
openshift-gitops rbd-appset-busybox4-placement-drpc 4h30m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.498228814s True
================================================================================
Then I relocated the apps below via the ACM UI at around Fri Aug 30 21:53:30 UTC 2024
(this was after the eviction timeout period of 1 hr had passed):
rbd-sub-busybox10-placement-1-drpc
cephfs-sub-busybox14-placement-1-drpc
cephfs-appset-busybox6-placement-drpc
rbd-appset-busybox2-placement-drpc
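The hub recovery timeline reported above can be summarised by computing the elapsed time from the moment the active hub was brought down. The sketch below only does arithmetic on the four timestamps given in this report; it assumes an English locale for the day and month names that `date -u` prints.

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S UTC %Y"  # layout of the `date -u` output above

events = [
    ("active hub brought down", "Fri Aug 30 17:14:12 UTC 2024"),
    ("backups restored on passive hub", "Fri Aug 30 17:23:02 UTC 2024"),
    ("DRPolicy validated", "Fri Aug 30 17:24:28 UTC 2024"),
    ("relocate triggered via ACM UI", "Fri Aug 30 21:53:30 UTC 2024"),
]

hub_down = datetime.strptime(events[0][1], FMT)

# Elapsed time of each event relative to the hub going down.
elapsed = {
    label: datetime.strptime(stamp, FMT) - hub_down
    for label, stamp in events
}

for label, delta in elapsed.items():
    print(f"+{delta}  {label}")
```

All four timestamps are approximate ("around" in the report), so the differences are indicative rather than exact.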
DRPC state after relocate-
drpc
////////////////////////////////
Fri Aug 30 22:14:38 UTC 2024
*******
NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed True
busybox-workloads-11 rbd-sub-busybox11-placement-1-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 898.25644ms True
busybox-workloads-12 rbd-sub-busybox12-placement-1-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 997.389937ms True
busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:59Z 2m37.256839681s True
busybox-workloads-14 cephfs-sub-busybox14-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate WaitForUser Paused True
busybox-workloads-15 cephfs-sub-busybox15-placement-1-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 1.797831304s True
busybox-workloads-16 cephfs-sub-busybox16-placement-1-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.297596905s True
busybox-workloads-9 rbd-sub-busybox9-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:37Z 1h4m2.552357112s True
openshift-gitops cephfs-appset-busybox5-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:25:00Z 2m36.525322643s True
openshift-gitops cephfs-appset-busybox6-placement-drpc 4h53m amagrawa-c2-28aug amagrawa-c1-28aug Relocate WaitForUser Paused True
openshift-gitops cephfs-appset-busybox7-placement-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:59Z 996.947009ms True
openshift-gitops cephfs-appset-busybox8-placement-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:25:02Z 597.880944ms True
openshift-gitops rbd-appset-busybox1-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Failover FailedOver Completed 2024-08-30T17:24:36Z 1h3m32.768428942s True
openshift-gitops rbd-appset-busybox2-placement-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate WaitForUser Paused True
openshift-gitops rbd-appset-busybox3-placement-drpc 4h53m amagrawa-c1-28aug Deployed Completed 2024-08-30T17:24:36Z 297.132956ms True
openshift-gitops rbd-appset-busybox4-placement-drpc 4h53m amagrawa-c2-28aug Deployed Completed 2024-08-30T17:24:59Z 1.498228814s True
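For the four DRPCs that were relocated via the ACM UI, the listings before (21:51:31 UTC) and after (22:14:38 UTC) the relocate can be compared directly. The following sketch uses values transcribed by hand from those two listings; it only reports what the tables show and makes no claim about the cause.

```python
C1, C2 = "amagrawa-c1-28aug", "amagrawa-c2-28aug"

# (PREFERREDCLUSTER, CURRENTSTATE, PROGRESSION) from the listing taken
# after hub recovery and from the listing taken after the relocate.
before_relocate = {
    "rbd-sub-busybox10-placement-1-drpc": (C2, "WaitForUser", "Paused"),
    "cephfs-sub-busybox14-placement-1-drpc": (C2, "WaitForUser", "Paused"),
    "cephfs-appset-busybox6-placement-drpc": (C2, "WaitForUser", "Paused"),
    "rbd-appset-busybox2-placement-drpc": (C2, "WaitForUser", "Paused"),
}
after_relocate = {
    "rbd-sub-busybox10-placement-1-drpc": (C1, "Relocated", "Completed"),
    "cephfs-sub-busybox14-placement-1-drpc": (C1, "WaitForUser", "Paused"),
    "cephfs-appset-busybox6-placement-drpc": (C2, "WaitForUser", "Paused"),
    "rbd-appset-busybox2-placement-drpc": (C1, "WaitForUser", "Paused"),
}

# DRPCs whose CURRENTSTATE is still WaitForUser after the relocate.
still_waiting = sorted(
    name for name, (_pref, state, _prog) in after_relocate.items()
    if state == "WaitForUser"
)

# DRPCs whose PREFERREDCLUSTER differs between the two listings.
preferred_changed = sorted(
    name for name in after_relocate
    if after_relocate[name][0] != before_relocate[name][0]
)

print("still WaitForUser:", still_waiting)
print("preferred cluster changed:", preferred_changed)
```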
================================================================================================================================================================
There was no change in the DRPC state for the following workloads:
cephfs-sub-busybox14-placement-1-drpc
cephfs-appset-busybox6-placement-drpc
rbd-appset-busybox2-placement-drpc
For rbd-sub-busybox10-placement-1-drpc, Relocate was marked as Completed but the workload still had issues.
NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER DESIREDSTATE CURRENTSTATE PROGRESSION START TIME DURATION PEER READY
busybox-workloads-10 rbd-sub-busybox10-placement-1-drpc 4h53m amagrawa-c1-28aug amagrawa-c2-28aug Relocate Relocated Completed True
C1-
busybox-10
Already on project "busybox-workloads-10" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
persistentvolumeclaim/busybox-pvc-41 Terminating pvc-6c950553-71ad-4eb6-ac17-7ada3c5a7c2c 42Gi RWO ocs-storagecluster-ceph-rbd <unset> 22h Filesystem
NAME AGE VOLUMEREPLICATIONCLASS PVCNAME DESIREDSTATE CURRENTSTATE
volumereplication.replication.storage.openshift.io/busybox-pvc-41 22h rbd-volumereplicationclass-1625360775 busybox-pvc-41 primary Primary
NAME DESIREDSTATE CURRENTSTATE
volumereplicationgroup.ramendr.openshift.io/rbd-sub-busybox10-placement-1-drpc primary Primary
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/busybox-41-5c55b45d49-6vb7h 0/1 Pending 0 11h <none> <none> <none> <none>
Here the PVC is in Terminating state and the pod is stuck in Pending (per the output above)
oc get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
busybox-41 0/1 1 0 11h
From C2- NA
Other outputs:
C1-
busybox-14
Now using project "busybox-workloads-14" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
persistentvolumeclaim/busybox-pvc-1 Bound pvc-3bd6202c-5aba-478b-86f4-571d9adb9334 94Gi RWX ocs-storagecluster-cephfs <unset> 2d13h Filesystem
persistentvolumeclaim/volsync-busybox-pvc-1-src Bound pvc-b79186e4-a159-4896-8571-964fd041e5cc 94Gi ROX ocs-storagecluster-cephfs-vrg <unset> 28s Filesystem
NAME DESIREDSTATE CURRENTSTATE
volumereplicationgroup.ramendr.openshift.io/cephfs-sub-busybox14-placement-1-drpc primary Primary
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/volsync-rsync-tls-src-busybox-pvc-1-cttmh 1/1 Running 0 29s 10.129.3.110 compute-0 <none> <none>
Here the deployment/pod is lost
oc get deploy
No resources found in busybox-workloads-14 namespace.
C2-
busybox-14
Now using project "busybox-workloads-14" on server "https://api.amagrawa-c2-28aug.qe.rh-ocs.com:6443".
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
persistentvolumeclaim/busybox-pvc-1 Bound pvc-9e4f5561-4deb-43e2-9c13-eb98895fe550 94Gi RWX ocs-storagecluster-cephfs <unset> 2d13h Filesystem
NAME DESIREDSTATE CURRENTSTATE
volumereplicationgroup.ramendr.openshift.io/cephfs-sub-busybox14-placement-1-drpc secondary Secondary
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/volsync-rsync-tls-dst-busybox-pvc-1-bnbd7 1/1 Running 0 4m59s 10.128.2.204 compute-1 <none> <none>
C1-
busybox-6
Now using project "busybox-workloads-6" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
persistentvolumeclaim/busybox-pvc-1 Bound pvc-0e84e495-fe6e-45b1-8c27-9f4c92055b29 94Gi RWX ocs-storagecluster-cephfs <unset> 2d17h Filesystem
NAME DESIREDSTATE CURRENTSTATE
volumereplicationgroup.ramendr.openshift.io/cephfs-appset-busybox6-placement-drpc primary Primary
Here the deployment/pod is lost
oc get deploy
No resources found in busybox-workloads-6 namespace.
C2-
busybox-6
Now using project "busybox-workloads-6" on server "https://api.amagrawa-c2-28aug.qe.rh-ocs.com:6443".
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
persistentvolumeclaim/busybox-pvc-1 Bound pvc-867dedb7-d424-43ab-bf0e-ec213fa6b295 94Gi RWX ocs-storagecluster-cephfs <unset> 2d17h Filesystem
NAME DESIREDSTATE CURRENTSTATE
volumereplicationgroup.ramendr.openshift.io/cephfs-appset-busybox6-placement-drpc secondary Secondary
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/volsync-rsync-tls-dst-busybox-pvc-1-bfx9r 1/1 Running 0 56s 10.128.2.213 compute-1 <none> <none>
C1-
busybox-2
Now using project "busybox-workloads-2" on server "https://api.amagrawa-c1-28aug.qe.rh-ocs.com:6443".
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
persistentvolumeclaim/busybox-pvc-41 Terminating pvc-fe3f096a-323f-4592-93a8-376249d948f2 42Gi RWO ocs-storagecluster-ceph-rbd <unset> 22h Filesystem
NAME AGE VOLUMEREPLICATIONCLASS PVCNAME DESIREDSTATE CURRENTSTATE
volumereplication.replication.storage.openshift.io/busybox-pvc-41 22h rbd-volumereplicationclass-1625360775 busybox-pvc-41 primary Primary
NAME DESIREDSTATE CURRENTSTATE
volumereplicationgroup.ramendr.openshift.io/rbd-appset-busybox2-placement-drpc primary Primary
Here the PVC is in Terminating state and the deployment/pod is lost
oc get deploy
No resources found in busybox-workloads-2 namespace.
C2- NA
Expected results: The deployment/pod should not be lost, and the above-mentioned apps should be relocated successfully when the relocate operation is triggered after hub recovery.
Additional info:
Logs collected after triggering relocate operation post hub recovery-
http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-aman/31aug24/