-
Bug
-
Resolution: Done
-
Undefined
-
4.19.0
-
Quality / Stability / Reliability
-
False
-
Critical
-
Proposed
Three single-node aggregated jobs have now failed with 5072 failures, caused by not having enough results to aggregate. Roughly half of the jobs fail to install because the image registry is reporting:
{Operator degraded (ImagePrunerJobFailed): ImagePrunerDegraded: Job has reached the specified backoff limit Operator degraded (ImagePrunerJobFailed): ImagePrunerDegraded: Job has reached the specified backoff limit}
Examples:
Or an example of the sub-job with the failure:
The closest I was able to get to a root cause was here
I1212 00:08:31.988786 67 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=false
I1212 00:08:31.988805 67 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=false
Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get pods)
Marked as critical because this is currently a payload blocker for 4.19.