  1. Red Hat OpenShift Data Science
  2. RHODS-1390

Prometheus not working. Pod status is CrashLoopBackOff



      Description of problem:

      We installed RHODS 1.0.15 in two PSI clusters (mod-qe-1 and mod-qe-4) using our installer script.

      After the installation, Prometheus is not available and the Prometheus pod is in CrashLoopBackOff status (see attached images).

      This bug is a test blocker for us, as we need to test Prometheus metrics.

      We believe this is a bug in RHODS 1.0.15, but it could also be a bug in our installer script. Could you verify whether you see the same behavior in your clusters?



      Prerequisites (if any, like setup, operators/versions):

      Steps to Reproduce

      1. Install RHODS 1.0.15 in mod-qe-4 by running the rhods-smoke pipeline, following the instructions in "ODS Smoke Suite for Checking Build Readiness"
      2. Once the installation has finished, go to https://prometheus-redhat-ods-monitoring.apps.modh-qe-4.dev.datahub.redhat.com/ and verify that the page shows "Application is not available"
      3. Ask QE team for kubeadmin credentials for mod-qe-4
      4. Go to Workloads > Pods and select project redhat-ods-monitoring
      5. Verify that pod prometheus-xxxxx has status CrashLoopBackOff
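
      Steps 4 and 5 can also be checked from the command line. The following is a minimal sketch, not part of our installer script: it assumes a logged-in `oc` session and that the project is named redhat-ods-monitoring (inferred from the route hostname above). The helper names are hypothetical.

```python
import json
import subprocess

# Assumption: project name inferred from the Prometheus route hostname.
NAMESPACE = "redhat-ods-monitoring"


def crashlooping_containers(pod_list):
    """Return (pod, container, restart_count) for every container that is
    currently waiting with reason CrashLoopBackOff in a pod list document."""
    found = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                found.append((pod_name, cs["name"], cs.get("restartCount", 0)))
    return found


def fetch_pods(namespace=NAMESPACE):
    """Return the parsed output of `oc get pods -n <namespace> -o json`.
    Requires the `oc` CLI on PATH and an authenticated session."""
    result = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

      Calling `crashlooping_containers(fetch_pods())` lists the affected containers. For each one, `oc logs -n redhat-ods-monitoring <pod> -c <container> --previous` prints the log of the last crashed container instance, which is usually the quickest way to see why it is restarting.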

      Actual results:

      The Prometheus route shows "Application is not available" and the Prometheus pod is in CrashLoopBackOff status.

      Expected results:

      The Prometheus application should be available.


      Reproducibility (Always/Intermittent/Only Once):

      It happened in at least two PSI clusters (mod-qe-1 and mod-qe-4), and I believe it also happened in an OpenShift Dedicated cluster we had last week.

      Build Details:

      Additional info:

              mshah10 Maulik Shah (Inactive)
              rhn-support-jgarciao Jorge Garcia Oncins
              Jorge Garcia Oncins Jorge Garcia Oncins
