  1. Data Foundation Bugs
  2. DFBUGS-319

[2321510] [RDR][4.17 clone] Relocate of ceph fs is stuck in WaitForReadiness


    • Bug
    • Resolution: Unresolved
    • Critical
    • odf-4.17.5
    • odf-4.16
    • odf-dr/ramen
    • RamenDR sprint 2024 #21
    • Proposed
    • None

      This bug was initially created as a copy of Bug #2319334 for the 4.17.z release.

      I am copying this bug because:

      Description of problem (please be detailed as possible and provide log

      [RDR] Relocate of ceph fs is stuck in WaitForReadiness

      Version of all relevant components (if applicable):
      OCS operator 4.16.3-2
      Cluster Version 4.16.0-0.nightly-2024-10-12-102620
      acm_version 2.11.3
      gitops_version 1.14.0
      submariner_version 0.18.0

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?

      Is there any workaround available to the best of your knowledge?

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?

      Is this issue reproducible?

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. Deploy a 4.16.3 RDR cluster
      2. Deploy CephFS workloads
      3. Relocate the CephFS workloads
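
      Step 3 can be triggered in more than one way, and the report does not say which was used. As a minimal sketch, assuming the relocate was requested by patching the DRPC directly, the following Python builds an `oc patch` command that sets the action and preferred cluster shown in the describe output under "Actual results". The helper name `build_relocate_command` is hypothetical, and the keys `action` and `preferredCluster` are assumed to be the JSON forms of the `Action` and `Preferred Cluster` fields. The command is only printed, not executed.

```python
import json
import shlex


def build_relocate_command(drpc_name, namespace, preferred_cluster):
    """Return the argv for an `oc patch` that requests a Relocate on a DRPC.

    Assumption: the DRPC spec exposes `action` and `preferredCluster`,
    matching the "Action" and "Preferred Cluster" fields in `oc describe`.
    """
    patch = {
        "spec": {
            "action": "Relocate",
            "preferredCluster": preferred_cluster,
        }
    }
    return [
        "oc", "patch", "drpc", drpc_name,
        "-n", namespace,
        "--type", "merge",
        "-p", json.dumps(patch),
    ]


# Values taken from the describe output in this report.
cmd = build_relocate_command(
    "busybox-3-placement-cephfs-drpc",
    "openshift-gitops",
    "prsurve-5c2",
)

# Print a shell-quoted form for review instead of running it.
print(shlex.join(cmd))
```

      Printing the command rather than running it keeps the sketch safe to try on a machine without cluster access.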

      Actual results:

      oc describe drpc busybox-3-placement-cephfs-drpc -n openshift-gitops
      Name: busybox-3-placement-cephfs-drpc
      Namespace: openshift-gitops
      Labels: cluster.open-cluster-management.io/backup=ramen
      Annotations: drplacementcontrol.ramendr.openshift.io/app-namespace: appset-busybox-3-cephfs
      drplacementcontrol.ramendr.openshift.io/last-app-deployment-cluster: prsurve-5c1
      API Version: ramendr.openshift.io/v1alpha1
      Kind: DRPlacementControl
      Creation Timestamp: 2024-10-16T13:21:33Z
      Generation: 2
      Owner References:
      API Version: cluster.open-cluster-management.io/v1beta1
      Block Owner Deletion: true
      Controller: true
      Kind: Placement
      Name: busybox-3-placement-cephfs
      UID: c08571cd-03c5-46f0-a1c5-4f77bea158fd
      Resource Version: 2853670
      UID: b63d684d-74b4-4c83-83e3-17a9829b5bc9
      Action: Relocate
      Dr Policy Ref:
      API Version: ramendr.openshift.io/v1alpha1
      Kind: DRPolicy
      Name: odr-policy-5m
      Placement Ref:
      API Version: cluster.open-cluster-management.io/v1beta1
      Kind: Placement
      Name: busybox-3-placement-cephfs
      Namespace: openshift-gitops
      Preferred Cluster: prsurve-5c2
      Pvc Selector:
      Match Labels:
      Appname: busybox_app3_cephfs
      Action Start Time: 2024-10-16T13:30:33Z
      Last Transition Time: 2024-10-16T13:30:43Z
      Message: Waiting for App resources to be restored...)
      Observed Generation: 2
      Reason: Relocating
      Status: False
      Type: Available
      Last Transition Time: 2024-10-16T13:34:43Z
      Message: Relocation in progress to cluster "prsurve-5c2"
      Observed Generation: 2
      Reason: NotStarted
      Status: False
      Type: PeerReady
      Last Transition Time: 2024-10-16T13:34:44Z
      Message: VolumeReplicationGroup (appset-busybox-3-cephfs/busybox-3-placement-cephfs-drpc) on cluster prsurve-5c2 is progressing on readying workload data (Not all VolSync PVCs are ready), retrying till DataReady condition is met
      Observed Generation: 2
      Reason: Progressing
      Status: False
      Type: Protected
      Last Group Sync Duration: 36.74055203s
      Last Group Sync Time: 2024-10-16T13:34:34Z
      Last Update Time: 2024-10-16T14:15:48Z
      Observed Generation: 2
      Phase: Relocating
      Preferred Decision:
      Cluster Name: prsurve-5c1
      Cluster Namespace: prsurve-5c1
      Progression: WaitForReadiness
      Resource Conditions:
      Last Transition Time: 2024-10-16T13:34:44Z
      Message: Not all VolSync PVCs are ready
      Observed Generation: 3
      Reason: Progressing
      Status: False
      Type: DataReady
      Last Transition Time: 2024-10-16T13:34:44Z
      Message: Not all VolSync PVCs are protected
      Observed Generation: 3
      Reason: Progressing
      Status: False
      Type: DataProtected
      Last Transition Time: 2024-10-16T13:34:44Z
      Message: Not all VolSync PVCs are protected
      Observed Generation: 3
      Reason: Progressing
      Status: False
      Type: ClusterDataProtected
      Last Transition Time: 2024-10-16T13:34:44Z
      Message: Restored PVs and PVCs
      Observed Generation: 3
      Reason: Restored
      Status: True
      Type: ClusterDataReady
      Resource Meta:
      Generation: 3
      Kind: VolumeReplicationGroup
      Name: busybox-3-placement-cephfs-drpc
      Namespace: appset-busybox-3-cephfs
      Resource Version: 3633777
      Type     Reason             Age                 From                           Message
      ----     ------             ----                ----                           -------
      Normal   DRPCDeploying      54m (x8 over 54m)   controller_DRPlacementControl  Deploying the application and VRG
      Normal   DRPCDeploySuccess  54m (x8 over 54m)   controller_DRPlacementControl  Successfully deployed the application and VRG
      Warning  unknown state      45m (x14 over 54m)  controller_DRPlacementControl  next state not known
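
      To see at a glance which conditions are holding the relocate in WaitForReadiness, the status above can be summarized programmatically. This is a minimal sketch in Python, assuming the DRPC is fetched as JSON (for example with `oc get drpc <name> -n <namespace> -o json`) and that the keys `status.conditions` and `status.resourceConditions.conditions` hold the DRPC and VRG condition lists shown above. The sample dictionary is hand-built from the values in this report, and `unmet_conditions` and `make_condition` are hypothetical helper names.

```python
def unmet_conditions(drpc):
    """Return (source, type, reason, message) for each condition not "True"."""
    status = drpc.get("status", {})
    groups = {
        # Conditions reported by the DRPC itself.
        "drpc": status.get("conditions", []),
        # Conditions copied from the VolumeReplicationGroup (assumed key names).
        "vrg": status.get("resourceConditions", {}).get("conditions", []),
    }
    return [
        (source, c.get("type", ""), c.get("reason", ""), c.get("message", ""))
        for source, conds in groups.items()
        for c in conds
        if c.get("status") != "True"
    ]


def make_condition(ctype, status, reason, message):
    """Build one condition entry in the assumed JSON shape."""
    return {"type": ctype, "status": status, "reason": reason, "message": message}


# Hand-built from the describe output in this report.
sample = {
    "status": {
        "phase": "Relocating",
        "progression": "WaitForReadiness",
        "conditions": [
            make_condition("Available", "False", "Relocating",
                           "Waiting for App resources to be restored...)"),
            make_condition("PeerReady", "False", "NotStarted",
                           'Relocation in progress to cluster "prsurve-5c2"'),
            make_condition("Protected", "False", "Progressing",
                           "VolumeReplicationGroup on cluster prsurve-5c2 is "
                           "progressing on readying workload data"),
        ],
        "resourceConditions": {
            "conditions": [
                make_condition("DataReady", "False", "Progressing",
                               "Not all VolSync PVCs are ready"),
                make_condition("DataProtected", "False", "Progressing",
                               "Not all VolSync PVCs are protected"),
                make_condition("ClusterDataProtected", "False", "Progressing",
                               "Not all VolSync PVCs are protected"),
                make_condition("ClusterDataReady", "True", "Restored",
                               "Restored PVs and PVCs"),
            ]
        },
    }
}

state = sample["status"]
print(f"phase={state['phase']} progression={state['progression']}")
for source, ctype, reason, message in unmet_conditions(sample):
    print(f"[{source}] {ctype}: {reason}: {message}")
```

      For the status in this report, the helper lists every condition except ClusterDataReady, which is consistent with the "Not all VolSync PVCs are ready" messages.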
      Expected results:

      Relocation should happen successfully

      Additional info:

              bmekhiss Benamar Mekhissi
              kramdoss@redhat.com Krishnaram Karthick Ramdoss
              Benamar Mekhissi
              Sidhant Agrawal
