Data Foundation Bugs / DFBUGS-79

[2232673] [RFE][RDR] VR and VRG status conditions do not reflect cephrbd image health

      Description of problem (please be as detailed as possible and provide log
      snippets):
      The VR and VRG can report correct (healthy) status conditions even when the cephrbd image peer_sites state is NOT up+replaying (for example, up+starting_replay).

      Example for busybox VRG, VR and associated cephrbd image:

      $ oc get vrg busybox-placement-1-drpc -n busybox-sample -o jsonpath='{.status.conditions}' | jq
      [
        { "lastTransitionTime": "2023-08-17T16:06:31Z", "message": "PVCs in the VolumeReplicationGroup are ready for use", "observedGeneration": 1, "reason": "Ready", "status": "True", "type": "DataReady" },
        { "lastTransitionTime": "2023-08-17T16:06:08Z", "message": "VolumeReplicationGroup is replicating", "observedGeneration": 1, "reason": "Replicating", "status": "False", "type": "DataProtected" },
        { "lastTransitionTime": "2023-08-17T16:06:07Z", "message": "Restored cluster data", "observedGeneration": 1, "reason": "Restored", "status": "True", "type": "ClusterDataReady" },
        { "lastTransitionTime": "2023-08-17T16:06:08Z", "message": "Cluster data of all PVs are protected", "observedGeneration": 1, "reason": "Uploaded", "status": "True", "type": "ClusterDataProtected" }
      ]
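
      For spot checks, a single VRG condition can be pulled with a jsonpath filter expression. A minimal sketch, assuming the same VRG name and namespace as above:

      # Print only the DataReady condition's status and reason; with the
      # conditions shown above this prints "True Ready".
      $ oc get vrg busybox-placement-1-drpc -n busybox-sample \
          -o jsonpath='{.status.conditions[?(@.type=="DataReady")].status}{" "}{.status.conditions[?(@.type=="DataReady")].reason}{"\n"}'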

      $ oc get vr busybox-pvc -n busybox-sample -o jsonpath='{.status.conditions}' | jq
      [
        { "lastTransitionTime": "2023-08-17T16:06:31Z", "message": "", "observedGeneration": 1, "reason": "Promoted", "status": "True", "type": "Completed" },
        { "lastTransitionTime": "2023-08-17T16:06:31Z", "message": "", "observedGeneration": 1, "reason": "Healthy", "status": "False", "type": "Degraded" },
        { "lastTransitionTime": "2023-08-17T16:06:31Z", "message": "", "observedGeneration": 1, "reason": "NotResyncing", "status": "False", "type": "Resyncing" }
      ]

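      For reference, the RBD image backing a PVC can be looked up from its PV before running the rbd checks below. A minimal sketch, assuming the ceph-csi RBD driver publishes the image name in the PV's volumeAttributes (the imageName attribute is an assumption; otherwise the csi-vol-<uuid> suffix of spec.csi.volumeHandle can be used):

      # Map the busybox PVC to its backing RBD image and query its mirror status.
      # (imageName attribute is an assumption for this ceph-csi version.)
      $ PV=$(oc get pvc busybox-pvc -n busybox-sample -o jsonpath='{.spec.volumeName}')
      $ IMAGE=$(oc get pv "$PV" -o jsonpath='{.spec.csi.volumeAttributes.imageName}')
      $ rbd -p ocs-storagecluster-cephblockpool mirror image status "$IMAGE"

      The captured status for the image in question: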

      $ rbd -p ocs-storagecluster-cephblockpool mirror image status csi-vol-c8bc0681-76f1-4f7c-8866-2c6e47372276
      csi-vol-c8bc0681-76f1-4f7c-8866-2c6e47372276:
        global_id:   f7362123-b264-48ee-85a3-7fb30b8f0e08
        state:       up+stopped
        description: local image is primary
        service:     a on bos5-zwmb8-ocs-0-nmd2n
        last_update: 2023-08-17 20:37:56
        peer_sites:
          name: 981d7c1f-0ab5-4ed1-a91b-050586b08ab8
          state: up+starting_replay
          description: starting replay
          last_update: 2023-08-17 20:37:56
        snapshots:
          10 .mirror.primary.f7362123-b264-48ee-85a3-7fb30b8f0e08.b81c40e6-3b52-4db0-ae5b-3df80855a7a4 (peer_uuids:[2fca7624-ef09-4ea4-961d-629af99fd6c0])
          11 .mirror.primary.f7362123-b264-48ee-85a3-7fb30b8f0e08.061cbbae-a897-4ed6-a6c6-a4b6e70cc758 (peer_uuids:[2fca7624-ef09-4ea4-961d-629af99fd6c0])
          12 .mirror.primary.f7362123-b264-48ee-85a3-7fb30b8f0e08.a84ca191-fc01-4a20-a4ee-6c2c17a5cd67 (peer_uuids:[2fca7624-ef09-4ea4-961d-629af99fd6c0])
          13 .mirror.primary.f7362123-b264-48ee-85a3-7fb30b8f0e08.b57f9475-5a13-4d8e-a18c-056713bb1308 (peer_uuids:[2fca7624-ef09-4ea4-961d-629af99fd6c0])
          113 .mirror.primary.f7362123-b264-48ee-85a3-7fb30b8f0e08.b9499d87-2d7a-46f6-97ac-2decb58885a9 (peer_uuids:[2fca7624-ef09-4ea4-961d-629af99fd6c0])

      $ rbd -p ocs-storagecluster-cephblockpool mirror pool status --verbose
      health: WARNING
      daemon health: OK
      image health: WARNING
      images: 1 total
          1 starting_replay

      DAEMONS
      service 15646:
        instance_id: 15652
        client_id: a
        hostname: bos5-zwmb8-ocs-0-nmd2n
        version: 17.2.6-70.el9cp
        leader: true
        health: OK

      IMAGES
      csi-vol-c8bc0681-76f1-4f7c-8866-2c6e47372276:
        global_id:   f7362123-b264-48ee-85a3-7fb30b8f0e08
        state:       up+stopped
        description: local image is primary
        service:     a on bos5-zwmb8-ocs-0-nmd2n
        last_update: 2023-08-17 20:38:56
        peer_sites:
          name: 981d7c1f-0ab5-4ed1-a91b-050586b08ab8
          state: up+starting_replay
          description: starting replay
          last_update: 2023-08-17 20:38:56
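
      Rook also surfaces this mirroring summary on the CephBlockPool CR, which is one place an operator could pick it up. A minimal sketch, assuming the default ODF pool name and that this Rook version populates status.mirroringStatus.summary (the exact field path may differ between versions):

      # Show the pool-level mirroring health summary reported by Rook.
      # (status.mirroringStatus.summary path is an assumption for this version.)
      $ oc get cephblockpool ocs-storagecluster-cephblockpool -n openshift-storage \
          -o jsonpath='{.status.mirroringStatus.summary}' | jq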

      Version of all relevant components (if applicable):
      $ oc version
      Client Version: 4.14.0-ec.4
      Kustomize Version: v5.0.1
      Server Version: 4.13.6
      Kubernetes Version: v1.26.6+73ac561

      ManagedCluster:
      $ oc get csv -n openshift-storage
      NAME DISPLAY VERSION REPLACES PHASE
      mcg-operator.v4.13.1-rhodf NooBaa Operator 4.13.1-rhodf mcg-operator.v4.13.0-rhodf Succeeded
      ocs-operator.v4.13.1-rhodf OpenShift Container Storage 4.13.1-rhodf ocs-operator.v4.13.0-rhodf Succeeded
      odf-csi-addons-operator.v4.13.1-rhodf CSI Addons 4.13.1-rhodf odf-csi-addons-operator.v4.13.0-rhodf Succeeded
      odf-operator.v4.13.1-rhodf OpenShift Data Foundation 4.13.1-rhodf odf-operator.v4.13.0-rhodf Succeeded
      odr-cluster-operator.v4.13.1-rhodf Openshift DR Cluster Operator 4.13.1-rhodf odr-cluster-operator.v4.13.0-rhodf Succeeded
      volsync-product.v0.7.4 VolSync 0.7.4 volsync-product.v0.7.3 Succeeded

      Hub Cluster:
      $ oc get csv -n openshift-operators
      NAME DISPLAY VERSION REPLACES PHASE
      odf-multicluster-orchestrator.v4.13.1-rhodf ODF Multicluster Orchestrator 4.13.1-rhodf odf-multicluster-orchestrator.v4.13.0-rhodf Succeeded
      odr-hub-operator.v4.13.1-rhodf Openshift DR Hub Operator 4.13.1-rhodf odr-hub-operator.v4.13.0-rhodf Succeeded

      $ oc get csv -n open-cluster-management
      NAME DISPLAY VERSION REPLACES PHASE
      advanced-cluster-management.v2.8.0 Advanced Cluster Management for Kubernetes 2.8.0

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what the user impact is)?

      Is there any workaround available to the best of your knowledge?
      No

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?
      3

      Is this issue reproducible?
      Yes

      Steps to Reproduce:
      1. Configure RDR and create the busybox application on the managed cluster.
      2. Remove Submariner connectivity.
      3. Assign a DRPolicy to the busybox app.
      4. Install Submariner again (see the connectivity check sketch below).
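
      A quick way to confirm that cross-cluster connectivity is actually down after step 2 and restored after step 4. A small sketch, assuming the subctl CLI is available against the managed cluster's kubeconfig:

      # Inspect the Submariner gateway connections on the managed cluster.
      $ subctl show connections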

      Actual results:
      Image replication does not start, and the VR and VRG status conditions do not reflect any problem with replication.

      Expected results:
      Image replication does not start, but the VR and VRG status conditions reflect the image replication problem.
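
      One way to verify the expected behaviour is to compare the VR Degraded condition against the live mirror state. A rough sketch, using the same VR and image as above:

      # With the RFE in place, a peer_sites state other than up+replaying should
      # surface in the Degraded condition instead of reason "Healthy".
      $ oc get vr busybox-pvc -n busybox-sample \
          -o jsonpath='{.status.conditions[?(@.type=="Degraded")].status}{" "}{.status.conditions[?(@.type=="Degraded")].reason}{"\n"}'
      $ rbd -p ocs-storagecluster-cephblockpool mirror image status \
          csi-vol-c8bc0681-76f1-4f7c-8866-2c6e47372276 | grep 'state:'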
