OpenShift SDN / SDN-3134

alert/PrometheusOperatorWatchErrors should not be at or above info


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major

      job link

      must-gather

      snippet from test output:

      {  PrometheusOperatorWatchErrors was at or above info for at least 2m57s on platformidentification.JobType{Release:"4.11", FromRelease:"4.10", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 0s, firing for 2m57s:
      
      Jun 06 19:49:40.986 - 59s   W alert/PrometheusOperatorWatchErrors ns/openshift-monitoring ALERTS{alertname="PrometheusOperatorWatchErrors", alertstate="firing", controller="alertmanager", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="warning"}
      Jun 06 19:49:40.986 - 59s   W alert/PrometheusOperatorWatchErrors ns/openshift-monitoring ALERTS{alertname="PrometheusOperatorWatchErrors", alertstate="firing", controller="prometheus", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="warning"}
      Jun 06 19:49:40.986 - 59s   W alert/PrometheusOperatorWatchErrors ns/openshift-monitoring ALERTS{alertname="PrometheusOperatorWatchErrors", alertstate="firing", controller="thanos", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="warning"}}
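
      The intervals above can be tallied with a short script. This is only a sketch: the lines are copied from the output with some labels trimmed, and the `summarize` helper and its regular expression are hypothetical, not part of the test suite. It shows that the reported 2m57s equals the sum of the three 59s per-controller intervals, which all start at the same timestamp. That the test adds up per-series durations is an inference from these numbers, not something the output states.

```python
import re

# Interval lines copied from the test output above (some labels trimmed for brevity).
LINES = [
    'Jun 06 19:49:40.986 - 59s   W alert/PrometheusOperatorWatchErrors ns/openshift-monitoring '
    'ALERTS{alertname="PrometheusOperatorWatchErrors", alertstate="firing", controller="alertmanager", severity="warning"}',
    'Jun 06 19:49:40.986 - 59s   W alert/PrometheusOperatorWatchErrors ns/openshift-monitoring '
    'ALERTS{alertname="PrometheusOperatorWatchErrors", alertstate="firing", controller="prometheus", severity="warning"}',
    'Jun 06 19:49:40.986 - 59s   W alert/PrometheusOperatorWatchErrors ns/openshift-monitoring '
    'ALERTS{alertname="PrometheusOperatorWatchErrors", alertstate="firing", controller="thanos", severity="warning"}',
]

# Hypothetical pattern for this line layout: duration in seconds, alert state, controller label.
PATTERN = re.compile(
    r'- (?P<secs>\d+)s\s+\w\s+alert/\S+.*?alertstate="(?P<state>\w+)".*?controller="(?P<controller>\w+)"'
)


def summarize(lines):
    """Return (seconds firing per controller, summed seconds, longest single series)."""
    per_controller = {}
    for line in lines:
        match = PATTERN.search(line)
        if match is None or match["state"] != "firing":
            continue  # skip lines that are not firing intervals
        name = match["controller"]
        per_controller[name] = per_controller.get(name, 0) + int(match["secs"])
    total = sum(per_controller.values())
    longest = max(per_controller.values(), default=0)
    return per_controller, total, longest


if __name__ == "__main__":
    per_controller, total, longest = summarize(LINES)
    print(per_controller)
    print(f"summed: {total // 60}m{total % 60}s, longest single series: {longest}s")
```

      If that reading is right, the alert fired for roughly one minute of wall-clock time across three controllers at once, rather than for three minutes.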
      

      As of the creation of this bug, this is happening in 5% of all failed runs of the aws-ovn-upgrade job. It's under 3% for
      the sdn jobs. It seems worth looking into. In the few jobs I checked, the alert was pending and/or firing
      at warning level at the very beginning of the e2e tests, which is just after cluster installation has completed. Maybe
      initial cluster install churn is causing this, and softening the alert expression is an option?

      link to this job's testgrid for reference.

       

            Assignee: Unassigned
            Reporter: Jamo Luhrsen (jluhrsen)