  1. Red Hat Advanced Cluster Management
  2. ACM-15632

Custom dashboards sometimes fail to load


    • 1
    • False
    • None
    • False
    • Obs Sprint 36, Observability Sprint 37, Observability Sprint 38
    • Moderate
    • None

      Description of problem:

      Custom dashboards sometimes fail to load.

      Before creating a custom dashboard, the dashboard loader creates a dashboard folder in Grafana. Creating the folder itself seems to work fine, but setting its permissions sometimes fails:

      Grafana pod logs:

      logger=http.server t=2024-10-03T07:51:14.819481662Z level=error msg="Could not set the default folder permissions" folder=Custom user="unsupported value type" error="rolling back transaction due to error failed: cannot rollback - no transaction is active: disk I/O error: permission denied"
      

      As a result, if we add a dashboard to this folder, it cannot be seen, since the permissions are not correct.

      In ACM-14639, as a workaround, we detect whether the permissions are set correctly and, if they are not, retry the process of adding the dashboard. This probably works eventually in most cases; however, we still see custom dashboards failing to load.

      Ideally we should have a more robust solution.

      Version-Release number of selected component (if applicable):

      ACM 2.12

      How reproducible:

      Sometimes.

      Steps to Reproduce:

      1. Create a custom dashboard (for example https://github.com/stolostron/multicluster-observability-operator/blob/main/examples/dashboards/sample_custom_dashboard/custom-sample-dashboard.yaml)
      2. Check the logs of the dashboard loader pod and see that the custom dashboard has problems being created.
      3. Check Grafana and see that the dashboard is missing.

      Actual results:

      Dashboard not always created

      Expected results:

      Dashboard always created

      Additional info:

      I tested this with the upstream Grafana image, with the same results. We have looked into this and talked to the community, and there are no other reports of it.

      We have also tested using their Go SDK instead of calling their HTTP API manually.

              rh-ee-tmange Thibault Mange
              rh-ee-jachanse Jacob Baungard Hansen
              Xiang Yin Xiang Yin