Type: Bug
Resolution: Not a Bug
Priority: Minor
Fix Version/s: None
Affects Version/s: 4.15.0
Component/s: Insights Operator
Labels:
- trt

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
No

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
CCXDEV Sprint 107
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

We gather data on how long alerts fire in CI job runs, and noticed today that the KubeJobFailed alert often fires for thousands of seconds, perhaps all the time, on techpreview job runs, both serial and non-serial. These jobs otherwise often look relatively healthy.

Examples:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.15-e2e-gcp-sdn-techpreview-serial/1726794097219342336

In this run, it appears to have been periodic-gathering-gjxd5.
Logs from that pod are here.

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.15-e2e-vsphere-ovn-techpreview-serial/1722375928086007808

More can be found here (scroll down to the job run list).

Expanding the intervals chart in the spyglass view shows when these alerts fired and what else was going on at the time.

It doesn't happen in every job run, but it does show up somewhere between the 75th and 95th percentile of alert firing time. Just eyeballing it, the alert fires in maybe 10-20% of runs.
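To make the percentile reasoning concrete, here is a rough sketch of how that estimate could be checked. It is not the actual tooling behind our alert data: the function name `summarize_firing` and the sample numbers are hypothetical, and it assumes one value per job run giving the total seconds KubeJobFailed spent firing. If the 75th percentile is zero while the 95th is not, the alert is firing in somewhere between 5% and 25% of runs, which lines up with the eyeballed 10-20%.

```python
from statistics import quantiles


def summarize_firing(seconds_per_run):
    """Return (fraction of runs where the alert fired, P75, P95 of firing seconds)."""
    if len(seconds_per_run) < 2:
        raise ValueError("need at least two job runs")
    fired = sum(1 for s in seconds_per_run if s > 0)
    # quantiles(n=100) returns 99 cut points: index 74 is P75, index 94 is P95.
    cuts = quantiles(seconds_per_run, n=100, method="inclusive")
    return fired / len(seconds_per_run), cuts[74], cuts[94]


# Hypothetical sample: 20 job runs, 3 of which had the alert firing.
runs = [0] * 17 + [1800, 2400, 3600]
fraction, p75, p95 = summarize_firing(runs)
print(f"fired in {fraction:.0%} of runs, P75={p75}s, P95={p95}s")
```

With real per-run data in place of `runs`, a zero P75 alongside a P95 in the thousands of seconds would match what the job run list shows.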

Could use assistance in determining why this job is marked failed (the logs don't look particularly alarming) and what, if anything, can be done about it.

Assignee: Tomas Remes

Reporter: Devan Goodwin

QA Contact: Joao Bastos Fula

Need Info From: None

Votes: 0

Watchers: 3

Created: 2023/11/21 5:32 PM

Updated: 2025/07/24 11:35 PM

Resolved: 2024/08/16 12:30 PM
