DFBUGS-378

[2259033] [MDR] Drcluster annotations doc needs to be more generic


    • Bug
    • Resolution: Unresolved
    • Critical
    • odf-4.14
    • odf-dr/ramen

      Description of problem (please be as detailed as possible and provide log
      snippets):
      Following the docs to set drcluster annotations:
      https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.14/html/configuring_openshift_data_foundation_disaster_recovery_for_openshift_workloads/metro-dr-solution#add-fencing-annotations-to-drclusters_mdr

      The instructions use a generic secret name for all drclusters:
      drcluster.ramendr.openshift.io/storage-secret-name: rook-csi-rbd-provisioner

      However, this caused fencing of the primary cluster to fail immediately in my case:

      # oc describe NetworkFence -A --context dc2
        Name: network-fence-dc1
        [...]
        Secret:
          Name: rook-csi-rbd-provisioner
          Namespace: openshift-storage
        Status:
          Message: rpc error: code = InvalidArgument desc = secrets "rook-csi-rbd-provisioner" not found

      I could only get to a Fenced state by editing drcluster dc1 to use the dc2 secret name:

      # oc --context dc1 get secret -A | grep rook-csi-rbd-provisioner
        openshift-storage rook-csi-rbd-provisioner-cluster1-rbdpool Opaque [...]
      # oc --context dc2 get secret -A | grep rook-csi-rbd-provisioner
        openshift-storage rook-csi-rbd-provisioner-cluster2-rbdpool Opaque [...]
      # oc edit drcluster dc1
        I configured the dc2 secret name:
          drcluster.ramendr.openshift.io/storage-secret-name: rook-csi-rbd-provisioner-cluster2-rbdpool
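
The manual edit above could also be expressed as a single "oc annotate" call. This is only a minimal sketch, not a documented procedure: fence_secret_annotation_cmd is a hypothetical helper that prints the command so it can be reviewed before it is run, and the drcluster and secret names are the ones from this report.

```shell
# Hypothetical helper: print (do not run) the oc command that points a
# drcluster at the peer cluster's provisioner secret.
# Assumption: names below come from this report and will differ elsewhere.
fence_secret_annotation_cmd() {
  local drcluster="$1" peer_secret="$2"
  printf 'oc annotate drcluster %s drcluster.ramendr.openshift.io/storage-secret-name=%s --overwrite\n' \
    "$drcluster" "$peer_secret"
}

# dc1 is annotated with the secret that exists on dc2.
fence_secret_annotation_cmd dc1 rook-csi-rbd-provisioner-cluster2-rbdpool
```

The printed line can then be run in the same place where "oc edit drcluster dc1" was run. The --overwrite flag is needed because the annotation already exists with the generic value.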

      Note: I think this is because, when I ran the ceph-external-cluster-details-exporter.py script to configure the connection to the external RHCS, I used "--cluster-name cluster[1|2] --restricted-auth-permission true" to differentiate between the two clusters, as described in:
      https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.14/html/deploying_openshift_data_foundation_in_external_mode/deploy-openshift-data-foundation-using-red-hat-ceph-storage#creating-an-openshift-data-foundation-cluster-service-for-external-storage_ceph-external

      I used those options because when I instead used "--run-as-user client.odf.cluster1", as some docs suggested, I got this error in the Ceph admin journal:
      ceph-mon[16028]: cephx server client.odf.cluster1: couldn't find entity name: client.odf.cluster1

      So, if --cluster-name is a valid option when configuring an external cluster, it seems that either the ramendr storage-secret-name should be generic so that all clusters can use the same annotation value, or, if that is not possible, the docs should mention that the annotation on the primary drcluster must reference the secondary drcluster's secret name (and vice versa?).
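
To make the cross-reference concrete, here is a sketch of what per-cluster annotation values might look like. It rests on two assumptions that this report does not confirm: that the secret name follows the pattern rook-csi-rbd-provisioner-<cluster-name>-<pool> (inferred only from the two secret names listed above), and that the "vice versa" case holds, so dc2 would reference the cluster1 secret. Both function names are hypothetical.

```shell
# Assumed naming pattern, inferred only from the two secrets in this report:
#   rook-csi-rbd-provisioner-<cluster-name>-<pool>
provisioner_secret_name() {
  local cluster_name="$1" pool="$2"
  printf 'rook-csi-rbd-provisioner-%s-%s\n' "$cluster_name" "$pool"
}

# Print the cross-referenced annotation for each drcluster:
# dc1 gets the cluster2 secret, dc2 gets the cluster1 secret (unconfirmed).
peer_annotations() {
  local pool="$1"
  printf '%s: drcluster.ramendr.openshift.io/storage-secret-name: %s\n' \
    dc1 "$(provisioner_secret_name cluster2 "$pool")" \
    dc2 "$(provisioner_secret_name cluster1 "$pool")"
}

peer_annotations rbdpool
```

Only the dc1 line was actually exercised in this report. The dc2 line is the untested mirror case.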

      Version of all relevant components (if applicable):
      OCP 4.14.6
      4.14.3-rhodf
      RHCS 6.1 on RHEL 9.2 nodes

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?
      The procedure to debug the Fencing failure was not clear at first, but I have recovered.

      Is there any workaround available to the best of your knowledge?
      Yes, I worked around it by changing the annotation.

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?
      4

      Is this issue reproducible?
      Yes

      Can this issue be reproduced from the UI?
      Yes

      If this is a regression, please provide more details to justify this:
      No

      Steps to Reproduce:
      1. Create external Ceph connections using --cluster-name
      2. Follow the DR annotation docs
      3. Fencing errors will occur
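
For step 1, the exporter invocation might look roughly like the sketch below. The --cluster-name and --restricted-auth-permission flags are the ones quoted in this report. The --rbd-data-pool-name flag and the pool value "rbdpool" are assumptions (the pool name is guessed from the secret suffix), and exporter_cmd is a hypothetical helper that only prints the command.

```shell
# Hypothetical helper: print (do not run) the exporter call for one cluster.
# Assumption: --rbd-data-pool-name and the pool name are not stated in the
# report; the pool name is inferred from the "-rbdpool" secret suffix.
exporter_cmd() {
  local cluster_name="$1" pool="$2"
  printf 'python3 ceph-external-cluster-details-exporter.py --rbd-data-pool-name %s --cluster-name %s --restricted-auth-permission true\n' \
    "$pool" "$cluster_name"
}

# One invocation per managed cluster, each with its own cluster name.
exporter_cmd cluster1 rbdpool
exporter_cmd cluster2 rbdpool
```

Running the exporter once per cluster with distinct names is what appears to produce the distinct secret names that the generic annotation value then fails to match.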

      Actual results:
      Fencing error due to generic secret name

      Expected results:
      Clear docs

              rtalur@redhat.com Raghavendra Talur
              jhopper@redhat.com Jenifer Abrams
              Harish Nallur Vittal Rao