  1. OpenShift SDN
  2. SDN-3031

failure in test case "[bz-monitoring][invariant] alert/Watchdog must have no gaps or changes"

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Sprint: SDN Sprint 218, SDN Sprint 219

    Description

      job link
      and two others that happened recently:
      1
      2

      must-gather

      snippet from test output:

      {  Watchdog alert had 29619 changes during the run, which may be a sign of a Prometheus outage in violation of the prometheus query SLO of 100% uptime
      
      May 05 03:51:24.252 - 4255s E alert/Watchdog ns/openshift-monitoring ALERTS{alertname="Watchdog", alertstate="firing", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="none"}
      May 05 05:04:20.252 - 928s  E alert/Watchdog ns/openshift-monitoring ALERTS{alertname="Watchdog", alertstate="firing", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="none"}}
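
The two firing intervals in the snippet above do not touch, which is what the invariant treats as a gap. A minimal sketch for measuring that gap follows. It assumes the `- 4255s` field is the interval's duration in seconds and that each line follows the layout shown in the snippet; the names `parse_interval` and `firing_gaps` are hypothetical helpers and are not part of the test suite.

```python
import re
from datetime import datetime, timedelta

# Interval lines from the test output above; label sets shortened to "{...}".
LINES = [
    "May 05 03:51:24.252 - 4255s E alert/Watchdog ns/openshift-monitoring ALERTS{...}",
    "May 05 05:04:20.252 - 928s  E alert/Watchdog ns/openshift-monitoring ALERTS{...}",
]

# Assumed layout: "<Mon DD HH:MM:SS.mmm> - <N>s <level> <locator> <message>".
INTERVAL_RE = re.compile(r"^\s*(\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) - (\d+)s\b")


def parse_interval(line):
    """Return (start, end) for one interval line, or None if it does not match."""
    match = INTERVAL_RE.match(line)
    if match is None:
        return None
    # The output carries no year; strptime defaults to 1900, which is fine
    # because only differences between timestamps are used.
    start = datetime.strptime(match.group(1), "%b %d %H:%M:%S.%f")
    return start, start + timedelta(seconds=int(match.group(2)))


def firing_gaps(lines):
    """Return the gaps, in seconds, between consecutive firing intervals."""
    intervals = sorted(i for i in map(parse_interval, lines) if i is not None)
    return [
        (next_start - prev_end).total_seconds()
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:])
    ]


if __name__ == "__main__":
    for gap in firing_gaps(LINES):
        print(f"gap between firing intervals: {gap:.0f}s")
```

Under those assumptions, the first interval ends at 05:02:19.252 and the second starts at 05:04:20.252, so the Watchdog alert was not reported as firing for about 121 seconds.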
      

      This looks like a new failure that started in the last day or two (since 5/4) and is happening in roughly 30% of the jobs (a rough guess from looking at testgrid).

      link to this job's testgrid for reference.

       

          People

            mkennell@redhat.com Martin Kennelly
            jluhrsen Jamo Luhrsen