-
Bug
-
Resolution: Done
-
Blocker
-
ACM 2.9.0
-
True
-
DR monitoring dashboard could not be validated.
-
False
-
-
-
Critical
-
No
Description of problem:
Version-Release number of selected component (if applicable):
ODF 4.14.0-150.stable
ACM 2.9.0-DOWNSTREAM-2023-10-12-14-53-11
advanced-cluster-management.v2.9.0-187
OCP 4.14.0-0.nightly-2023-10-14-061428
How reproducible:
Steps to Reproduce:
1. Configure a Regional DR setup with OCP and ODF 4.14 and deploy RBD-based workloads on it.
2. Enable the DR monitoring dashboard and check that the graphs, cluster operator status, etc. show values.
3. To configure the dashboard, create a MultiClusterObservability resource on the RHACM hub cluster:
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  enableDownsampling: true
  observabilityAddonSpec:
    enableMetrics: true
    interval: 60
  storageConfig:
    alertmanagerStorageSize: 1Gi
    compactStorageSize: 100Gi
    metricObjectStorage:
      key: thanos.yaml
      name: thanos-object-storage
    receiveStorageSize: 100Gi
    ruleStorageSize: 1Gi
    storageClass: thin-csi-odf
    storeStorageSize: 10Gi
4. Create the Thanos object storage secret using a NooBaa OBC from one of the managed clusters where ODF is installed:
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: observability-bucket-******
      endpoint: s3-openshift-storage.apps.********.qe.rh-ocs.com
      insecure: true
      access_key: ***************
      secret_key: ****************
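Since the bucket name, endpoint, and keys differ per setup, the secret can be rendered from variables instead of edited by hand. The following is a minimal sketch, not part of the original reproduction steps: `render_thanos_secret` is a hypothetical helper name, and the argument values shown in the usage comment are placeholders that would be taken from the OBC on the managed cluster.

```shell
# Hypothetical helper: prints the thanos-object-storage Secret manifest.
# Arguments: bucket name, S3 endpoint, access key, secret key.
render_thanos_secret() {
  bucket="$1"
  endpoint="$2"
  access_key="$3"
  secret_key="$4"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: ${bucket}
      endpoint: ${endpoint}
      insecure: true
      access_key: ${access_key}
      secret_key: ${secret_key}
EOF
}

# Example usage on the hub (placeholder values, assumed for illustration):
# render_thanos_secret "my-bucket" "s3.example.com" "my-access-key" "my-secret-key" | oc apply -f -
```

Piping the output to `oc apply -f -` keeps the credentials out of a file on disk; writing it to a file first works equally well if the manifest needs review.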
5. Run on hub
oc get MultiClusterObservability observability -o jsonpath='
' on RHACM hub cluster --> should return True in a few mins
6. Run on hub
kubectl label namespace openshift-operators openshift.io/cluster-monitoring='true'
7. Create the following ConfigMap on the RHACM hub to allowlist the DR metrics:
kind: ConfigMap
apiVersion: v1
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names:
      - odf_system_health_status
      - odf_system_map
      - odf_system_raw_capacity_total_bytes
      - odf_system_raw_capacity_used_bytes
      - ceph_rbd_mirror_snapshot_sync_bytes
      - ceph_rbd_mirror_snapshot_snapshots
    matches:
      - __name__="csv_succeeded",exported_namespace="openshift-storage",name=~"odf-operator.*"
      - __name__="csv_succeeded",exported_namespace="openshift-dr-system",name=~"odr-cluster-operator.*"
      - __name__="csv_succeeded",exported_namespace="openshift-operators",name=~"volsync.*"
    recording_rules:
      - record: count_persistentvolumeclaim_total
        expr: count(kube_persistentvolumeclaim_info)
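Prometheus selects a metric by name through the reserved `__name__` label (two underscores on each side), and copy-pasting through a ticket or wiki can silently drop underscores. The sketch below is a quick sanity check for that, under the assumption that every `csv_succeeded` entry in the allowlist is meant to be a `__name__` matcher; `check_csv_matchers` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: reads allowlist text on stdin and succeeds only if every
# line that mentions csv_succeeded selects it through the __name__ label.
check_csv_matchers() {
  # Count csv_succeeded lines that do NOT contain the expected matcher.
  # "|| true" keeps the assignment from failing when the count is 0.
  bad=$(grep 'csv_succeeded' | grep -vc '__name__="csv_succeeded"' || true)
  [ "$bad" -eq 0 ]
}

# Example usage against the applied ConfigMap on the hub (assumes oc is logged in):
# oc get configmap observability-metrics-custom-allowlist \
#   -n open-cluster-management-observability \
#   -o jsonpath='{.data.metrics_list\.yaml}' | check_csv_matchers && echo "matchers look OK"
```

The check is purely textual, so it can also be run on the local YAML file before applying it.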
8. Hard refresh the RHACM console on the hub and navigate to the Data policies page to see the dashboard.