  1. Red Hat OpenShift Data Science
  2. RHODS-4797

Alerts "User notebook pvc usage above 90%" and "User notebook pvc usage above 100%" are not fired on failure


    • Release Notes
    • PVC usage limit alerts were not sent when usage exceeded 90% and 100%:: Alerts indicating when a PVC exceeded 90% and 100% of its capacity failed to be triggered and sent.
    • Documented as Resolved Issue
    • RHODS 1.15
    • Urgent

      Description of problem:

      After installing the new version of RHODS, we found that some tests fail because the alerts "User notebook pvc usage above 90%" and "User notebook pvc usage above 100%" are not triggered after the tests run.

      Investigating, we found that the first part of the expression in those alerts


      does not return any value. When we wrote the expression


      we found that the label prometheus_replica="prometheus-k8s-0" does not match the actual value returned by the expression above:

      label prometheus_replica="prometheus-k8s-1"
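
The mismatch described above can be illustrated with a minimal Python sketch of equality label matching. This is not the actual alert expression (which is not reproduced in this report). The metric name `example_pvc_usage_metric` and the helper `select` are hypothetical and used only for illustration. Only the two `prometheus_replica` values come from the report.

```python
from typing import Dict, List

Labels = Dict[str, str]


def select(series: List[Labels], matchers: Labels) -> List[Labels]:
    """Return the series whose labels equal every matcher value.

    A missing label is treated as the empty string, mirroring how
    equality matchers behave in PromQL.
    """
    return [
        s for s in series
        if all(s.get(name, "") == value for name, value in matchers.items())
    ]


# Hypothetical sample: the only series is reported by replica prometheus-k8s-1.
series = [
    {
        "__name__": "example_pvc_usage_metric",  # hypothetical metric name
        "prometheus_replica": "prometheus-k8s-1",
    },
]

# A matcher hard-coded to prometheus-k8s-0 selects nothing, so a rule
# built on it has no data to evaluate.
hard_coded = select(series, {"prometheus_replica": "prometheus-k8s-0"})

# Without the replica matcher, the same series is selected.
without_replica = select(series, {"__name__": "example_pvc_usage_metric"})

print(len(hard_coded), len(without_replica))
```

Under these assumptions, the selection with the hard-coded replica label is empty, which is consistent with the alert never being triggered on clusters where the series carries prometheus_replica="prometheus-k8s-1".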


      We found this issue in add-on installations of the new RHODS version, both after an upgrade and after a fresh installation.

      Prerequisites (if any, like setup, operators/versions):


      Steps to Reproduce

      1. Install RHODS
      2. Run the test
        Verify Alert RHODS-PVC-Usage-Above-90 Is Fired When User PVC Is Above 90 Percent

        or manually run one of these scripts in a notebook:
        https://github.com/redhat-rhods-qe/ods-ci-notebooks-main/blob/main/notebooks/200__monitor_and_manage/203__alerts/notebook-pvc-usage/fill-notebook-pvc-to-complete-100.ipynb
        https://github.com/redhat-rhods-qe/ods-ci-notebooks-main/blob/main/notebooks/200__monitor_and_manage/203__alerts/notebook-pvc-usage/fill-notebook-pvc-over-90.ipynb

      3. Check the alert "User notebook pvc usage above 90%" and confirm that it has not been triggered after 2 minutes
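
When reproducing manually, it can help to confirm that the volume really is above the threshold before waiting for the alert. The following is a minimal Python sketch, not part of the test suite. The helper names, the use of the current directory as a stand-in for the notebook PVC mount path, and the exact comparisons (strictly above 90, at or above 100) are assumptions. Only the alert names come from this report.

```python
import shutil
from typing import List

ALERT_90 = "User notebook pvc usage above 90%"
ALERT_100 = "User notebook pvc usage above 100%"


def usage_percent(used_bytes: int, capacity_bytes: int) -> float:
    """Return used space as a percentage of capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return 100.0 * used_bytes / capacity_bytes


def expected_alerts(percent: float) -> List[str]:
    """Return the alerts one would expect at this usage level.

    Assumed thresholds: strictly above 90 for the first alert and
    at or above 100 for the second. The real rule may differ.
    """
    alerts = []
    if percent > 90.0:
        alerts.append(ALERT_90)
    if percent >= 100.0:
        alerts.append(ALERT_100)
    return alerts


if __name__ == "__main__":
    # "." is a placeholder; replace it with the PVC mount path in the notebook.
    usage = shutil.disk_usage(".")
    percent = usage_percent(usage.used, usage.total)
    print(f"usage: {percent:.1f}%  expected alerts: {expected_alerts(percent)}")
```

If the script reports usage above 90% and the alert is still not triggered after 2 minutes, the problem is reproduced.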

      Actual results:

      Alerts are not triggered when PVC usage exceeds 90%

      Expected results:

      Alerts are triggered

      Reproducibility (Always/Intermittent/Only Once):

      We found this issue in 4 clusters with the add-on installed. In a cluster installed through the CatalogSource, the alert was triggered.

      Build Details:

      RHODS v1140-3 in stage


      Additional info:

            vhire Vaishnavi Hire
            pablo-rhods Pablo Felix (Inactive)
            Pablo Felix Pablo Felix (Inactive)