Data Foundation Bugs / DFBUGS-82

[2296264] [MDR] Not able to disable Disaster Recovery for ACM discovered applications after primary is down and failing over to secondary

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Version: odf-4.16
    • Component: odf-dr/ramen
    • Sprint: RamenDR sprint 2024 #18, RamenDR sprint 2024 #19

      Description of problem (please be as detailed as possible and provide log
      snippets):

      Hi,

      I am trying to recover to a replacement cluster with MDR for discovered
      apps by following the steps in [1].

      [1]
      https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.15/html/configuring_openshift_data_foundation_disaster_recovery_for_openshift_workloads/metro-dr-solution#recovering-to-a-replacement-cluster-with-mdr_manage-mdr

      I then followed the steps below to disable DR for the discovered apps:

      [2]
      https://docs.google.com/document/d/1BoqbEqDBLCQZXp2qvd7Hw5mvg59njv1dqlrH6Hy7L58/edit#heading=h.1yx58g1ouy2

      When I try to delete the DRPC, it gets stuck in the Deleting state. Below
      is the DRPC YAML output:

      ➜ hub oc get drpc imperative-1 -n openshift-dr-ops -oyaml
      apiVersion: ramendr.openshift.io/v1alpha1
      kind: DRPlacementControl
      metadata:
        annotations:
          drplacementcontrol.ramendr.openshift.io/app-namespace: openshift-dr-ops
          drplacementcontrol.ramendr.openshift.io/last-app-deployment-cluster: asagare-sec
        creationTimestamp: "2024-07-04T07:39:18Z"
        deletionGracePeriodSeconds: 0
        deletionTimestamp: "2024-07-08T05:25:48Z"
        finalizers:
        - drpc.ramendr.openshift.io/finalizer
        generation: 6
        labels:
          cluster.open-cluster-management.io/backup: ramen
        name: imperative-1
        namespace: openshift-dr-ops
        ownerReferences:
        - apiVersion: cluster.open-cluster-management.io/v1beta1
          blockOwnerDeletion: true
          controller: true
          kind: Placement
          name: imperative-1-placement-1
          uid: 0e05a1af-f89a-48b1-aa0f-28cb89b14344
        resourceVersion: "12464211"
        uid: 2f372e03-fb46-4ec5-96d2-e517f9f88d09
      spec:
        action: Failover
        drPolicyRef:
          apiVersion: ramendr.openshift.io/v1alpha1
          kind: DRPolicy
          name: odr-policy-mdr
        failoverCluster: asagare-sec
        kubeObjectProtection:
          captureInterval: 2m0s
          kubeObjectSelector:
            matchExpressions:
            - key: appname
              operator: In
              values:
              - busybox
        placementRef:
          apiVersion: cluster.open-cluster-management.io/v1beta1
          kind: Placement
          name: imperative-1-placement-1
          namespace: openshift-dr-ops
        preferredCluster: asagare-pri
        protectedNamespaces:
        - busybox-discovered
        pvcSelector:
          matchExpressions:
          - key: appname
            operator: In
            values:
            - busybox
      status:
        actionStartTime: "2024-07-05T11:28:26Z"
        conditions:
        - lastTransitionTime: "2024-07-05T11:28:27Z"
          message: Completed
          observedGeneration: 5
          reason: FailedOver
          status: "True"
          type: Available
        - lastTransitionTime: "2024-07-05T11:28:26Z"
          message: cleaning secondaries
          observedGeneration: 5
          reason: Cleaning
          status: "False"
          type: PeerReady
        - lastTransitionTime: "2024-07-05T11:29:57Z"
          message: VolumeReplicationGroup (openshift-dr-ops/imperative-1) on cluster asagare-sec
            is reporting errors (Cluster data of one or more PVs are unprotectedVRG Kube
            object protect errorunable to ListKeys in DeleteObjects from endpoint https://s3-openshift-storage.apps.asagare-pri.qe.rh-ocs.com
            bucket odrbucket-84427fcbc7ce keyPrefix openshift-dr-ops/imperative-1/kube-objects/1/velero/backups/)
            protecting workload resources, retrying till ClusterDataProtected condition
            is met
          observedGeneration: 5
          reason: Error
          status: "False"
          type: Protected
        lastKubeObjectProtectionTime: "2024-07-05T11:20:51Z"
        lastUpdateTime: "2024-07-07T19:49:50Z"
        observedGeneration: 6
        phase: Deleting
        preferredDecision:
          clusterName: asagare-pri
          clusterNamespace: asagare-pri
        progression: Deleting
        resourceConditions:
          conditions:
          - lastTransitionTime: "2024-07-05T11:29:52Z"
            message: PVCs in the VolumeReplicationGroup are ready for use
            observedGeneration: 1
            reason: Ready
            status: "True"
            type: DataReady
          - lastTransitionTime: "2024-07-05T11:29:52Z"
            message: VolumeReplicationGroup is replicating
            observedGeneration: 1
            reason: Replicating
            status: "False"
            type: DataProtected
          - lastTransitionTime: "2024-07-05T11:29:16Z"
            message: Restored PVs and PVCs
            observedGeneration: 1
            reason: Restored
            status: "True"
            type: ClusterDataReady
          - lastTransitionTime: "2024-07-05T11:29:52Z"
            message: Cluster data of one or more PVs are unprotectedVRG Kube object protect
              errorunable to ListKeys in DeleteObjects from endpoint https://s3-openshift-storage.apps.asagare-pri.qe.rh-ocs.com
              bucket odrbucket-84427fcbc7ce keyPrefix openshift-dr-ops/imperative-1/kube-objects/1/velero/backups/
            observedGeneration: 1
            reason: UploadError
            status: "False"
            type: ClusterDataProtected
          resourceMeta:
            generation: 1
            kind: VolumeReplicationGroup
            name: imperative-1
            namespace: openshift-dr-ops
            protectedpvcs:
            - busybox-pvc
            resourceVersion: "7885562"
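
      Reading the output above: the DRPC is held by the drpc.ramendr.openshift.io/finalizer,
      while the Protected condition keeps failing against the S3 endpoint on the
      powered-off primary (s3-openshift-storage.apps.asagare-pri.qe.rh-ocs.com), which
      appears to be why the finalizer is never removed. A minimal sketch for confirming
      what is blocking the deletion (standard oc/jsonpath usage; the name and namespace
      are from this report):

      ➜ hub oc get drpc imperative-1 -n openshift-dr-ops -o jsonpath='{.metadata.finalizers}'
      # Expect ["drpc.ramendr.openshift.io/finalizer"], the finalizer holding the object in Deleting
      ➜ hub oc get drpc imperative-1 -n openshift-dr-ops \
          -o jsonpath='{.status.conditions[?(@.type=="Protected")].message}'
      # Prints the ListKeys/DeleteObjects error against the unreachable primary's S3 endpoint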

      Version of all relevant components (if applicable):

      OCP: 4.16.0-0.nightly-2024-06-27-091410
      ODF: 4.16.0-134
      ACM: 2.11.0-140
      CEPH: 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)
      OADP: 1.4.0
      GitOps: 1.12.4

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?

      Is there any workaround available to the best of your knowledge?

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?

      Is this issue reproducible?
      Yes.

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. Upgraded the MDR setup from 4.15.4 to 4.16.
      2. Deployed discovered apps on the primary cluster and applied the DRPolicy to them.
      3. Powered off the primary cluster.
      4. Followed the replace-cluster steps in doc [1].
      5. Disabled DR for the protected apps using doc [2] (see the sketch after this list).
      6. DRPC deletion got stuck in the Deleting state.
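
      For reference, a minimal sketch of the deletion attempt in steps 5-6 (the DRPC
      name and namespace are taken from the YAML above; the commands are standard
      oc usage):

      ➜ hub oc delete drpc imperative-1 -n openshift-dr-ops --wait=false
      ➜ hub oc get drpc imperative-1 -n openshift-dr-ops -o jsonpath='{.status.phase}'
      # Keeps returning "Deleting": deletionTimestamp is set, but the finalizer never clears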

      Actual results:
      DRPC deletion is stuck in the Deleting state.

      Expected results:
      The DRPC should be deleted.

      Additional info:
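
      A possible manual unblock, offered here as an assumption and not validated in
      this report: force-remove the finalizer so the delete can complete. Note that
      this skips ramen's cleanup, so kube-object backups may be left orphaned in the
      S3 bucket.

      ➜ hub oc patch drpc imperative-1 -n openshift-dr-ops --type=json \
          -p='[{"op": "remove", "path": "/metadata/finalizers"}]'
      # Drops drpc.ramendr.openshift.io/finalizer; only sensible when cleanup against
      # the dead cluster's S3 endpoint can never succeed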

              Assignee: Raghavendra Talur (rtalur@redhat.com)
              Reporter: Avdhoot Sagare (rh-ee-asagare)
              Votes: 0
              Watchers: 9