OpenShift SDN / SDN-3031

failure in test case "[bz-monitoring][invariant] alert/Watchdog must have no gaps or changes"


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Sprint: SDN Sprint 218, SDN Sprint 219

      job link
      and two others that happened recently:
      1
      2

      must-gather

      snippet from test output:

      {  Watchdog alert had 29619 changes during the run, which may be a sign of a Prometheus outage in violation of the prometheus query SLO of 100% uptime
      
      May 05 03:51:24.252 - 4255s E alert/Watchdog ns/openshift-monitoring ALERTS{alertname="Watchdog", alertstate="firing", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="none"}
      May 05 05:04:20.252 - 928s  E alert/Watchdog ns/openshift-monitoring ALERTS{alertname="Watchdog", alertstate="firing", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", severity="none"}}
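
To see how large the gap between the two firing intervals above is, the interval lines can be parsed with a short script. This is only a rough sketch for triage, not part of the test suite: the function names `parse_intervals` and `gaps_between` are made up for illustration, the line layout (`<timestamp> - <duration>s ...`) is assumed from the two lines quoted above, and the `year` argument is a placeholder because the output carries no year.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the snippet above:
#   "May 05 03:51:24.252 - 4255s E alert/Watchdog ..."
INTERVAL_RE = re.compile(
    r"^\s*(\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) - (\d+)s\s"
)


def parse_intervals(lines, year=2000):
    """Return sorted (start, end) pairs for every line that looks like an interval.

    `year` is a placeholder: the test output does not include one.
    """
    intervals = []
    for line in lines:
        match = INTERVAL_RE.match(line)
        if not match:
            continue  # skip headers, blank lines, anything unrecognized
        start = datetime.strptime(
            f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S.%f"
        )
        end = start + timedelta(seconds=int(match.group(2)))
        intervals.append((start, end))
    return sorted(intervals)


def gaps_between(intervals):
    """Return the time between the end of each interval and the start of the next."""
    return [
        nxt_start - prev_end
        for (_, prev_end), (nxt_start, _) in zip(intervals, intervals[1:])
    ]


if __name__ == "__main__":
    sample = [
        'May 05 03:51:24.252 - 4255s E alert/Watchdog ns/openshift-monitoring',
        'May 05 05:04:20.252 - 928s  E alert/Watchdog ns/openshift-monitoring',
    ]
    for gap in gaps_between(parse_intervals(sample)):
        print(f"gap: {gap.total_seconds():.0f}s")
```

For the two intervals quoted above, the first ends at 05:02:19.252 and the second starts at 05:04:20.252, so the script reports a single gap of 121 seconds.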
      

      This looks like something new: it started in the last day or two (since 5/4) and is happening in ~30% of the jobs (a rough guess from looking at testgrid).

      link to this job's testgrid for reference.

       

              mkennell@redhat.com Martin Kennelly
              jluhrsen Jamo Luhrsen