Type: Bug
Resolution: Done
Priority: Blocker
Fix Version/s: ACM 2.9.0
Affects Version/s: ACM 2.9.0
Component/s: Observability, Server Foundation
Labels:
- Obs-Core
- RDR-Blocker

Blocked:
True
Blocked Reason:
DR monitoring dashboard could not be validated.
Ready:
False
Intelligence Requested:
Market:

Severity:
Critical

Regression:
No

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Description of problem:

Version-Release number of selected component (if applicable):

ODF 4.14.0-150.stable
ACM 2.9.0-DOWNSTREAM-2023-10-12-14-53-11
advanced-cluster-management.v2.9.0-187
OCP 4.14.0-0.nightly-2023-10-14-061428

How reproducible:

Steps to Reproduce:

1. Configure a Regional DR setup with OCP and ODF 4.14 and deploy RBD-based workloads on it.
2. Enable the DR monitoring dashboard and check whether the graphs, cluster operator status, etc. show values.
3. To configure the dashboard, create a MultiClusterObservability resource on the RHACM hub cluster:
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  enableDownsampling: true
  observabilityAddonSpec:
    enableMetrics: true
    interval: 60
  storageConfig:
    alertmanagerStorageSize: 1Gi
    compactStorageSize: 100Gi
    metricObjectStorage:
      key: thanos.yaml
      name: thanos-object-storage
    receiveStorageSize: 100Gi
    ruleStorageSize: 1Gi
    storageClass: thin-csi-odf
    storeStorageSize: 10Gi

4. Create the Thanos object storage Secret using a NooBaa OBC from one of the managed clusters where ODF is installed:
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: observability-bucket-******
      endpoint: s3-openshift-storage.apps.********.qe.rh-ocs.com
      insecure: true
      access_key: ***************
      secret_key: ****************
5. Run on the RHACM hub cluster; it should return True within a few minutes:
oc get MultiClusterObservability observability -o jsonpath='{.status.conditions[1].status}'
6. Run on hub
kubectl label namespace openshift-operators openshift.io/cluster-monitoring='true'
7. Create the following ConfigMap to allowlist DR metrics on the RHACM hub:
kind: ConfigMap
apiVersion: v1
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names:
      - odf_system_health_status
      - odf_system_map
      - odf_system_raw_capacity_total_bytes
      - odf_system_raw_capacity_used_bytes
      - ceph_rbd_mirror_snapshot_sync_bytes
      - ceph_rbd_mirror_snapshot_snapshots
    matches:
      - __name__="csv_succeeded",exported_namespace="openshift-storage",name=~"odf-operator.*"
      - __name__="csv_succeeded",exported_namespace="openshift-dr-system",name=~"odr-cluster-operator.*"
      - __name__="csv_succeeded",exported_namespace="openshift-operators",name=~"volsync.*"
    recording_rules:
      - record: count_persistentvolumeclaim_total
        expr: count(kube_persistentvolumeclaim_info)
8. Hard-refresh the RHACM console on the hub and navigate to the Data policies page to see the dashboard.
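
Step 5 reads the condition at a fixed index (`conditions[1]`), which depends on the order in which conditions are reported. A minimal sketch of looking the condition up by its type instead is shown here. The condition type name `Ready` and the helper names `condition_status` and `fetch_mco` are assumptions for illustration and are not taken from this report; `fetch_mco` simply shells out to the same `oc get` command with `-o json`.

```python
import json
import subprocess
from typing import Optional


def condition_status(resource: dict, cond_type: str) -> Optional[str]:
    """Return the status of the condition with the given type, or None if absent."""
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond.get("status")
    return None


def fetch_mco(name: str = "observability") -> dict:
    """Fetch the MultiClusterObservability resource as a dict (needs `oc` and a hub login)."""
    result = subprocess.run(
        ["oc", "get", "MultiClusterObservability", name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Offline example with a hand-written status block (not real cluster output).
sample = {
    "status": {
        "conditions": [
            {"type": "Installing", "status": "True"},
            {"type": "Ready", "status": "True"},  # "Ready" is an assumed type name
        ]
    }
}
print(condition_status(sample, "Ready"))
```

On a hub cluster the same check would be `condition_status(fetch_mco(), "Ready")`, assuming the resource reports a condition of that type.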

Actual results: Graphs are empty and the cluster operator is shown as degraded, because DR metrics do not work with ACM observability.

Expected results: Graphs should show data while RBD workloads are running, and the cluster operator should be healthy.
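
One way to narrow down which allowlisted metrics are missing on the hub is to run an instant query for each metric name and list the ones that return no series. The sketch below assumes a Prometheus-compatible `/api/v1/query` endpoint is reachable with a bearer token; `base_url`, `token`, and the helper names are hypothetical, and the authentication that ACM observability actually requires may differ.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

# Metric names taken from the allowlist ConfigMap in step 7.
DR_METRICS = [
    "odf_system_health_status",
    "odf_system_map",
    "odf_system_raw_capacity_total_bytes",
    "odf_system_raw_capacity_used_bytes",
    "ceph_rbd_mirror_snapshot_sync_bytes",
    "ceph_rbd_mirror_snapshot_snapshots",
]


def has_series(response: dict) -> bool:
    """True if a Prometheus instant-query response succeeded and returned at least one series."""
    return response.get("status") == "success" and bool(
        response.get("data", {}).get("result")
    )


def missing_metrics(responses: Dict[str, dict]) -> List[str]:
    """Return the metric names whose query response contains no series."""
    return [name for name, resp in responses.items() if not has_series(resp)]


def query(base_url: str, token: str, metric: str) -> dict:
    """Run an instant query; base_url and bearer-token auth are assumptions."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": metric})
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a reachable endpoint, `missing_metrics({m: query(base_url, token, m) for m in DR_METRICS})` would list the metrics that are not arriving on the hub.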

Additional info:

Assignee: Subbarao Meduri

Reporter: Aman Agrawal

QA Contact: Hui Chen

Votes: 0

Watchers: 9

Created: 2023/10/18 10:06 AM

Updated: 2023/10/25 3:54 PM

Resolved: 2023/10/20 9:37 AM
