Red Hat Advanced Cluster Management / ACM-8232

[RDR] Graphs are empty and cluster operator is degraded as DR metrics don't work with ACM observability

    • True
    • DR monitoring dashboard could not be validated.
    • False
    • Critical
    • No

      Description of problem:

      Version-Release number of selected component (if applicable):

      ODF 4.14.0-150.stable
      ACM 2.9.0-DOWNSTREAM-2023-10-12-14-53-11
      advanced-cluster-management.v2.9.0-187
      OCP 4.14.0-0.nightly-2023-10-14-061428

      How reproducible:

      Steps to Reproduce:

      1. Configure a Regional DR setup with OCP and ODF 4.14 and deploy RBD-based workloads on it.
      2. Enable the DR monitoring dashboard and check whether the graphs show values, what the cluster operator status is, etc.
      3. To configure the dashboard, create a MultiClusterObservability resource on the RHACM hub cluster.
      apiVersion: observability.open-cluster-management.io/v1beta2
      kind: MultiClusterObservability
      metadata:
        name: observability
      spec:
        enableDownsampling: true
        observabilityAddonSpec:
          enableMetrics: true
          interval: 60
        storageConfig:
          alertmanagerStorageSize: 1Gi
          compactStorageSize: 100Gi
          metricObjectStorage:
            key: thanos.yaml
            name: thanos-object-storage
          receiveStorageSize: 100Gi
          ruleStorageSize: 1Gi
          storageClass: thin-csi-odf
          storeStorageSize: 10Gi

      4. Create thanos object storage yaml using a Noobaa OBC from one of the managed clusters where ODF is installed.
      apiVersion: v1
      kind: Secret
      metadata:
        name: thanos-object-storage
        namespace: open-cluster-management-observability
      type: Opaque
      stringData:
        thanos.yaml: |
          type: s3
          config:
            bucket: observability-bucket-******
            endpoint: s3-openshift-storage.apps.********.qe.rh-ocs.com
            insecure: true
            access_key: ***************
            secret_key: ****************
      5. Run the following on the RHACM hub cluster; it should return True within a few minutes.
      oc get MultiClusterObservability observability -o jsonpath='{.status.conditions[1].status}'
      6. Run on hub
      kubectl label namespace openshift-operators openshift.io/cluster-monitoring='true'
      7. Create the YAML below to whitelist DR metrics on the RHACM hub.
      kind: ConfigMap
      apiVersion: v1
      metadata:
        name: observability-metrics-custom-allowlist
        namespace: open-cluster-management-observability
      data:
        metrics_list.yaml: |
          names:
            - odf_system_health_status
            - odf_system_map
            - odf_system_raw_capacity_total_bytes
            - odf_system_raw_capacity_used_bytes
            - ceph_rbd_mirror_snapshot_sync_bytes
            - ceph_rbd_mirror_snapshot_snapshots
          matches:
            - __name__="csv_succeeded",exported_namespace="openshift-storage",name=~"odf-operator.*"
            - __name__="csv_succeeded",exported_namespace="openshift-dr-system",name=~"odr-cluster-operator.*"
            - __name__="csv_succeeded",exported_namespace="openshift-operators",name=~"volsync.*"
          recording_rules:
            - record: count_persistentvolumeclaim_total
              expr: count(kube_persistentvolumeclaim_info)
      8. Hard refresh the RHACM console of the hub and navigate to the Data policies page to see the dashboard.
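
For anyone re-running the reproduction, steps 4 and 5 can be scripted roughly as below. This is only a sketch, not part of the original report: the helper names render_thanos_secret and wait_for_true are hypothetical, the bucket values are expected to be supplied by the caller from the Noobaa OBC, and the retry count and delay are arbitrary.

```shell
# Sketch only: both helpers are hypothetical, not part of oc or RHACM.

# Print a thanos-object-storage Secret manifest for the given bucket values.
# Args: bucket endpoint access_key secret_key
render_thanos_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: $1
      endpoint: $2
      insecure: true
      access_key: $3
      secret_key: $4
EOF
}

# Re-run a command until it prints exactly "True", or give up.
# Args: attempts delay_seconds command...
wait_for_true() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # A failing command is treated the same as an empty answer.
    out="$("$@" 2>/dev/null)" || out=""
    if [ "$out" = "True" ]; then return 0; fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Example usage on the hub (commented out; needs a logged-in oc session):
# render_thanos_secret "$BUCKET" "$ENDPOINT" "$ACCESS_KEY" "$SECRET_KEY" | oc apply -f -
# wait_for_true 30 10 oc get MultiClusterObservability observability \
#   -o jsonpath='{.status.conditions[1].status}'
```

The wait helper keeps the jsonpath from step 5 unchanged, so it inherits the same dependence on the position of the condition in the status list.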

      Actual results: Graphs are empty and cluster operator is degraded as DR metrics don't work with ACM observability

      Expected results: Graphs shouldn't be empty if rbd workloads are running and cluster operator should be healthy.
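
One way to make the expected result checkable is to compare the whitelisted metric names against the names that actually reach the hub. The sketch below assumes the observed metric names have already been exported one per line to a file by whatever means is convenient; the function name missing_metrics and the file name in the usage comment are hypothetical.

```shell
# Sketch only: missing_metrics is a hypothetical helper.

# Print each expected metric name that does not appear in the observed list.
# Args: observed_file expected_name...
missing_metrics() {
  observed_file="$1"; shift
  for name in "$@"; do
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -- "$name" "$observed_file" || printf '%s\n' "$name"
  done
}

# Example usage (commented out; observed.txt is an assumed export):
# missing_metrics observed.txt \
#   odf_system_health_status odf_system_map \
#   ceph_rbd_mirror_snapshot_sync_bytes ceph_rbd_mirror_snapshot_snapshots
```

An empty output would mean every listed metric is present; any printed name is a metric the dashboard would have no data for.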

      Additional info:

            smeduri1@redhat.com Subbarao Meduri
            amagrawa@redhat.com Aman Agrawal
            Hui Chen Hui Chen