OpenShift Bugs / OCPBUGS-23517

KubeJobFailed alert in openshift-insights fires often and long in techpreview E2E jobs

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • 4.15.0
    • Insights Operator
    • No
    • CCXDEV Sprint 107
    • 1
    • False

    Description

      We gather data on how long alerts fire in CI job runs, and noticed today that the KubeJobFailed alert often fires for thousands of seconds, perhaps all the time, on techpreview job runs, both serial and non-serial. These job runs often look relatively healthy.

      Examples:

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.15-e2e-gcp-sdn-techpreview-serial/1726794097219342336

      • in this run it appears it was periodic-gathering-gjxd5
      • Logs from that pod are here

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.15-e2e-vsphere-ovn-techpreview-serial/1722375928086007808

      More can be found here. (scroll down to the job run list)

      Expanding the intervals chart in the spyglass view shows when these alerts fired and what else was going on at the time.

      It's not every job run, but the alert does show up somewhere between the 75th and 95th percentile; just eyeballing the job run list, it appears in maybe 10-20% of runs.

      Could use assistance in determining why this job is marked failed (the logs don't look particularly alarming) and what, if anything, can be done about it.
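
      One possible starting point, when the pod logs look clean, is the Job object's own status: Kubernetes records a condition of type Failed with a reason and message when it marks a Job failed. The sketch below is only an illustration of reading those conditions from a Job's JSON; the function name and the sample object (including its reason and message) are hypothetical and are not taken from the affected runs.

```python
def failed_job_reasons(job: dict) -> list[tuple[str, str]]:
    """Return (reason, message) for every Failed=True condition on a Job object."""
    conditions = job.get("status", {}).get("conditions") or []
    return [
        (c.get("reason", ""), c.get("message", ""))
        for c in conditions
        if c.get("type") == "Failed" and c.get("status") == "True"
    ]


# Hypothetical Job object for illustration only; the real one would come from
# the cluster or from the must-gather artifacts of the job run.
sample_job = {
    "metadata": {"name": "example-job", "namespace": "openshift-insights"},
    "status": {
        "conditions": [
            {
                "type": "Failed",
                "status": "True",
                "reason": "BackoffLimitExceeded",
                "message": "Job has reached the specified backoff limit",
            }
        ]
    },
}

for reason, message in failed_job_reasons(sample_job):
    print(f"{reason}: {message}")
```

      On a live cluster the input could be the output of `oc get job <name> -n openshift-insights -o json` parsed with `json.loads`; an empty result would suggest the Job was not marked failed through a Failed condition.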

          People

            tremes1@redhat.com Tomas Remes
            rhn-engineering-dgoodwin Devan Goodwin
            Joao Bastos Fula