Data Foundation Bugs / DFBUGS-769

[2260838] [ODF] "Data Services" not visible after setting up Multicluster storage health, error "Failed to update monitoring-endpoint-monitoring-work work", "the size of manifests is 58935 bytes which exceeds the 50k limit"

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Fix version: odf-4.13.13
    • Affects version: odf-4.13
    • 4.13.8
    • Committed
    • Committed

      +++ This bug was initially created as a clone of Bug #2223461 +++

      Description of problem (please be as detailed as possible and provide log snippets):

      The customer is setting up Multicluster storage health on their sandbox cluster, following the instructions in "Chapter 2. Multicluster storage health" of the Red Hat OpenShift Data Foundation 4.12 monitoring documentation on the Red Hat Customer Portal [1].

      [1] https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html/monitoring_openshift_data_foundation/multicluster_storage_health

      Configmap `observability-metrics-custom-allowlist` has been added to namespace `open-cluster-management-observability`, however, upon verification, `Data Services` is not visible on the RHACM console.

      Version of all relevant components (if applicable):

      ODF v4.12.2

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?

      The customer cannot move forward with the testing phase.

      Is there any workaround available to the best of your knowledge? No.

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)? 3

      Is this issue reproducible? No.

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. N/A

      Actual results:

      Expected results:

      Additional info:

      — Additional comment from on 2023-07-18 01:37:36 UTC —

      • The configmap `observability-metrics-custom-allowlist` is in namespace `open-cluster-management-observability`:

      ~~~
      [redhat@ch1opnlvdev4 ODF-Monitoring]$ oc get cm -n open-cluster-management-observability
      NAME                                                              DATA   AGE
      alertmanager-ca-bundle                                            1      37d
      config-service-cabundle                                           1      37d
      config-trusted-cabundle                                           1      37d
      grafana-dashboard-acm-clusters-overview                           1      37d
      grafana-dashboard-acm-clusters-overview-ocp311                    1      37d
      grafana-dashboard-acm-optimization-overview                       1      37d
      grafana-dashboard-acm-optimization-overview-ocp311                1      37d
      grafana-dashboard-cluster-rsrc-use                                1      37d
      grafana-dashboard-k8s-apiserver                                   1      37d
      grafana-dashboard-k8s-capacity-management-ocp311                  1      37d
      grafana-dashboard-k8s-compute-resources-cluster                   1      37d
      grafana-dashboard-k8s-compute-resources-namespace-pods            1      37d
      grafana-dashboard-k8s-compute-resources-namespace-pods-ocp311     1      37d
      grafana-dashboard-k8s-compute-resources-namespace-workloads       1      37d
      grafana-dashboard-k8s-compute-resources-node-pods                 1      37d
      grafana-dashboard-k8s-compute-resources-pod                       1      37d
      grafana-dashboard-k8s-compute-resources-pod-ocp311                1      37d
      grafana-dashboard-k8s-compute-resources-workload                  1      37d
      grafana-dashboard-k8s-etcd-cluster                                1      37d
      grafana-dashboard-k8s-namespaces-in-cluster-ocp311                1      37d
      grafana-dashboard-k8s-networking-cluster                          1      37d
      grafana-dashboard-k8s-pods-in-namespace-ocp311                    1      37d
      grafana-dashboard-k8s-service-level-overview                      1      37d
      grafana-dashboard-k8s-service-level-overview-api-server-cluster   1      37d
      grafana-dashboard-k8s-summary-by-node-ocp311                      1      37d
      grafana-dashboard-node-rsrc-use                                   1      37d
      kube-root-ca.crt                                                  1      37d
      observability-metrics-allowlist                                   2      37d
      observability-metrics-custom-allowlist                            1      10d
      observability-observatorium-api                                   1      37d
      observability-thanos-receive-controller-tenants                   1      37d
      observability-thanos-receive-controller-tenants-generated         1      37d
      openshift-service-ca.crt                                          1      37d
      rbac-query-proxy-probe                                            1      37d
      thanos-ruler-config                                               1      37d
      thanos-ruler-default-rules                                        1      37d
      ~~~

      • The yaml used to create the configmap:

      ~~~
      kind: ConfigMap
      apiVersion: v1
      metadata:
        name: observability-metrics-custom-allowlist
        namespace: open-cluster-management-observability
      data:
        metrics_list.yaml: |
          names:
            - odf_system_health_status
            - odf_system_map
            - odf_system_raw_capacity_total_bytes
            - odf_system_raw_capacity_used_bytes
          matches:
            - __name__="csv_succeeded",exported_namespace="openshift-storage",name=~"odf-operator.*"
      ~~~
      • `.../namespaces/open-cluster-management/pods/multicluster-observability-operator-674cbcff85-ndp7c/multicluster-observability-operator/multicluster-observability-operator/logs/rotated/529.log.20230707-230347`

      ~~~
      2023-07-07T22:29:19.345124878+00:00 stderr F 2023-07-07T22:29:19.344Z ERROR controller_placementrule Failed to update monitoring-endpoint-monitoring-work work {"error": "admission webhook \"manifestworkvalidators.admission.work.open-cluster-management.io\" denied the request: the size of manifests is 58935 bytes which exceeds the 50k limit"}
      2023-07-07T22:29:19.345124878+00:00 stderr F github.com/stolostron/multicluster-observability-operator/operators/multiclusterobservability/controllers/placementrule.createManagedClusterRes
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/operators/multiclusterobservability/controllers/placementrule/placementrule_controller.go:434
      2023-07-07T22:29:19.345124878+00:00 stderr F github.com/stolostron/multicluster-observability-operator/operators/multiclusterobservability/controllers/placementrule.createAllRelatedRes
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/operators/multiclusterobservability/controllers/placementrule/placementrule_controller.go:354
      2023-07-07T22:29:19.345124878+00:00 stderr F github.com/stolostron/multicluster-observability-operator/operators/multiclusterobservability/controllers/placementrule.(*PlacementRuleReconciler).Reconcile
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/operators/multiclusterobservability/controllers/placementrule/placementrule_controller.go:158
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
      2023-07-07T22:29:19.345124878+00:00 stderr F 2023-07-07T22:29:19.345Z ERROR controller_placementrule Failed to create manifestwork {"error": "admission webhook \"manifestworkvalidators.admission.work.open-cluster-management.io\" denied the request: the size of manifests is 58935 bytes which exceeds the 50k limit"}
      2023-07-07T22:29:19.345124878+00:00 stderr F github.com/stolostron/multicluster-observability-operator/operators/multiclusterobservability/controllers/placementrule.(*PlacementRuleReconciler).Reconcile
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/operators/multiclusterobservability/controllers/placementrule/placementrule_controller.go:158
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
      2023-07-07T22:29:19.345124878+00:00 stderr F 2023-07-07T22:29:19.345Z ERROR controller_placementrule Failed to create managedcluster resources {"namespace": "local-cluster", "error": "admission webhook \"manifestworkvalidators.admission.work.open-cluster-management.io\" denied the request: the size of manifests is 58935 bytes which exceeds the 50k limit"}
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
      2023-07-07T22:29:19.345124878+00:00 stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
      2023-07-07T22:29:19.345124878+00:00 stderr F /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
      2023-07-07T22:29:19.345124878+00:00 stderr F 2023-07-07T22:29:19.345Z INFO controller_placementrule Monitoring operator should be installed in cluster {"cluster_name": "nyocplab1", "request.name": "mch-updated-request", "request.namespace": "open-cluster-management"}
      2023-07-07T22:29:19.345179625+00:00 stderr F 2023-07-07T22:29:19.345Z INFO controller_placementrule observabilityaddon already existed/unchanged {"namespace": "nyocplab1"}
      2023-07-07T22:29:19.345179625+00:00 stderr F 2023-07-07T22:29:19.345Z INFO controller_placementrule clusterrolebinding endpoint-observability-mco-rolebinding already existed/unchanged {"namespace": "nyocplab1"}
      2023-07-07T22:29:19.345179625+00:00 stderr F 2023-07-07T22:29:19.345Z INFO controller_placementrule rolebinding endpoint-observability-res-rolebinding already existed/unchanged {"namespace": "nyocplab1"}
      2023-07-07T22:29:19.345251693+00:00 stderr F 2023-07-07T22:29:19.345Z INFO controller_placementrule Updating manifestwork {"nyocplab1": "nyocplab1", "name": "nyocplab1-observability"}

      ~~~
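
      For context, the manifest payload the webhook rejects lives in the ManifestWork resource in the managed cluster's namespace on the hub. A rough way to gauge how close a given ManifestWork is to the 50k limit is sketched below; the resource name and namespace are taken from the log above, and the webhook's exact byte accounting may differ:

      ~~~
      # Sketch only: approximate the size of the manifests carried by the
      # observability ManifestWork for managed cluster "nyocplab1"
      # (name and namespace taken from the log above).
      oc get manifestwork nyocplab1-observability -n nyocplab1 -o json \
        | jq -c '.spec.workload.manifests' \
        | wc -c
      ~~~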

      — Additional comment from RHEL Program Management on 2023-07-18 01:37:44 UTC —

      This bug, having no release flag set previously, now has the release flag 'odf-4.14.0' set to '?', and so is proposed to be fixed in the ODF 4.14.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if previously set while the release flag was missing, have now been reset, since the Acks are to be set against a release flag.

      — Additional comment from gowtham on 2023-07-24 12:35:37 UTC —

      I suspect ACM observability has some issues. Please check that the MultiClusterObservability status is Ready (which means ACM observability is deployed and healthy):

      oc get MultiClusterObservability observability -o jsonpath='{.status.conditions}'

      The output of the above command should include "message: Observability components are deployed and running", "reason: Ready", "status: True", "type: Ready".
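
      A narrower form of the same check, pulling just the Ready condition (a minimal sketch using standard kubectl/oc JSONPath filtering; resource and name as in the command above):

      ~~~
      # Should print "True" when ACM observability is deployed and healthy.
      oc get multiclusterobservability observability \
        -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
      ~~~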

      — Additional comment from Joydeep Banerjee on 2023-07-24 13:36:24 UTC —

      Can you check if we see the metrics like:
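
      One way to spot-check whether the allowlisted ODF metrics are reaching the hub is to query them through the ACM observability query proxy. A minimal sketch, assuming the default rbac-query-proxy route created by the ACM observability deployment (the route name is an assumption, not confirmed in this bug):

      ~~~
      # Assumes the default rbac-query-proxy route in the observability namespace.
      HOST=$(oc get route rbac-query-proxy -n open-cluster-management-observability \
        -o jsonpath='{.spec.host}')
      # Query one of the allowlisted ODF metrics via the Prometheus query API.
      curl -sk -H "Authorization: Bearer $(oc whoami -t)" \
        "https://${HOST}/api/v1/query?query=odf_system_health_status"
      ~~~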

      — Additional comment from Red Hat Bugzilla on 2023-08-03 08:31:20 UTC —

      Account disabled by LDAP Audit

      — Additional comment from on 2023-08-04 01:48:48 UTC —

      Thanks gowtham, joydeep.

      The customer has confirmed that they can search for the ODF metrics in Grafana [1]; however, they did not confirm whether any data is available.

      They did confirm, though, that after installing the Multicluster Orchestrator 4.12 operator with the console plugin enabled, they are able to see Data Services on the hub console. They also said they have submitted feedback that this operator should be included in the prerequisites for setting up the Multicluster storage health dashboard.

      Can the documentation [2] be updated to include this prerequisite?

      [1] supportshell:/cases/03557262/0050-grafana-screenshot.png

      [2] https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html/monitoring_openshift_data_foundation/multicluster_storage_health#enabling-multicluster-dashboard-on-hub-cluster_rhodf
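
      For anyone verifying this prerequisite, a rough check that the Multicluster Orchestrator operator is installed and its console plugin is enabled might look like the following; the CSV name pattern and plugin name are assumptions, not confirmed in this bug:

      ~~~
      # CSV name pattern is assumed; adjust to what your cluster actually shows.
      oc get csv -A | grep -i multicluster-orchestrator
      # List enabled console plugins; the ODF multicluster plugin
      # (e.g. "odf-multicluster-console") should appear here.
      oc get console.operator.openshift.io cluster -o jsonpath='{.spec.plugins}'
      ~~~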

      — Additional comment from Sunil Kumar Acharya on 2023-09-01 08:14:24 UTC —

      We have dev freeze of ODF-4.14.0 on 04-SEP-2023. Since this BZ has not been approved and is not marked as blocker/exception, it will be moved out to ODF-4.15.0 on 04-SEP-2023.

      If you think this BZ should be considered as an exception/blocker feel free to set the flag with justification note. Also, please mention the estimated date by which this BZ can be moved to MODIFIED state.

      — Additional comment from Sunil Kumar Acharya on 2023-09-12 06:21:42 UTC —

      ODF-4.14 has entered 'blocker only' phase on 12-SEP-2023. Hence, moving the non-blocker BZs to ODF-4.15. If you think this BZ needs to be evaluated for ODF-4.14, please feel free to propose the BZ as a blocker/exception to ODF-4.14 with a justification note.

      — Additional comment from gowtham on 2024-01-08 11:47:02 UTC —

      Ack, I will inform the documentation team to update the doc.

      — Additional comment from RHEL Program Management on 2024-01-17 15:50:18 UTC —

      The 'Target Release' is not to be set manually at the Red Hat OpenShift Data Foundation product.

      The 'Target Release' will be auto set appropriately after the 3 Acks (pm, devel, qa) are set to "+" for a specific release flag and that release flag gets auto set to "+".

      — Additional comment from RHEL Program Management on 2024-01-25 10:58:35 UTC —

      This BZ is being approved for an ODF 4.12.z z-stream update, upon receipt of the 3 ACKs (PM, Devel, QA) for the release flag 'odf-4.12.z', and having been marked for an approved z-stream update.

      — Additional comment from RHEL Program Management on 2024-01-25 10:58:35 UTC —

      Since this bug has been approved for ODF 4.12.11 release, through release flag 'odf-4.12.z+', and appropriate update number entry at the 'Internal Whiteboard', the Target Release is being set to 'ODF 4.12.11'

      — Additional comment from Olive Lakra on 2024-01-29 08:24:26 UTC —

      Hi Gowtham,

      Does this request apply to 4.13, 4.14, and 4.15 apart from 4.12?

      — Additional comment from Olive Lakra on 2024-01-29 08:50:47 UTC —

      Doc updated. Added the following bullet point to the prerequisites:

      ----------------------------------------------------------

      • Ensure that you have installed Multicluster Orchestrator 4.12 operator with plugin for console enabled.

      ----------------------------------------------------------

      Staging url for review: https://dxp-docp-prod.apps.ext-waf.spoke.prod.us-west-2.aws.paas.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html-single/monitoring_openshift_data_foundation/index?lb_target=stage#enabling-multicluster-dashboard-on-hub-cluster_rhodf

      — Additional comment from gowtham on 2024-01-29 10:00:41 UTC —

      This change is applicable to 4.12, 4.13, 4.14, and 4.15.

      — Additional comment from gowtham on 2024-01-29 10:02:12 UTC —

      It's up to the documentation team to decide which versions to backport this change to.

              Assignee: Olive Lakra (olakra@redhat.com)
              Reporter: Olive Lakra (olakra@redhat.com)
              Parikshith Byregowda
              Votes: 0
              Watchers: 9