- Proposed title of this feature request
Please modify the default value of the `backoffLimit` for the image-pruner job from 0 to 6
- What is the nature and description of the request?
Since the backoffLimit for the image-pruner job is set to 0, no retries are performed in case of an error.
It is still set to 0 in the latest OpenShift 4.17.
~~~
$ oc -n openshift-image-registry get cronjob image-pruner -o json | jq -r '.spec.jobTemplate.spec' | grep backoffLimit
"backoffLimit": 0,
~~~
On the other hand, the API documentation describes the default value of backoffLimit as 6.
~~~
$ oc explain Cronjob.spec.jobTemplate.spec.backoffLimit
GROUP: batch
KIND: CronJob
VERSION: v1
FIELD: backoffLimit <integer>
DESCRIPTION:
Specifies the number of retries before marking this job failed. Defaults to
6
~~~
It should be modified from 0 to 6 in order to allow the pod to retry when an error occurs.
- Why does the customer need this? (List the business requirements here)
Our customer had a job failure due to network errors caused by temporary load within the cluster. Transient failures of this kind are to be anticipated.
Currently, since the backoffLimit is set to 0, retries are not being performed.
The image-pruner cronjob is scheduled to run once a day, meaning the next execution is set for 24 hours later.
This results in the image-registry co remaining in a DEGRADED state due to the cronjob error until that time.
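To put the 24-hour wait in perspective, here is a rough sketch of how quickly retries would run with a backoffLimit of 6. It assumes the exponential back-off documented for Kubernetes Jobs (10s, 20s, 40s, ... capped at six minutes). The function name `retry_schedule` is purely illustrative and not part of any CLI, and real timings depend on the Job controller and pod start-up time.

```shell
#!/usr/bin/env bash
# Sketch only: approximate wait before each retry of a failed Job pod,
# ASSUMING the documented Kubernetes back-off (10s, 20s, 40s, ... capped at 360s).
# retry_schedule is a hypothetical helper, not an oc/kubectl command.
retry_schedule() {
  local backoff_limit=$1
  local delay=10 cap=360 total=0 attempt
  for ((attempt = 1; attempt <= backoff_limit; attempt++)); do
    total=$((total + delay))
    echo "retry ${attempt}: after ${delay}s (cumulative ${total}s)"
    # Double the delay for the next retry, but never exceed the cap.
    delay=$((delay * 2))
    if ((delay > cap)); then delay=$cap; fi
  done
  echo "total: ${total}s"
}

retry_schedule 6
```

Under these assumptions, all six retries would be attempted within roughly ten minutes of back-off delay, whereas with backoffLimit set to 0 the next attempt only happens at the next scheduled run.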
- List any affected packages or components
- openshift-image-registry
- Describe the impact to you or the business
When the cronjob fails, the image-registry co remains in a DEGRADED state until the next job execution.
Additionally, we advise customers to attempt a retry if the OpenShift REST API returns an error.
OpenShift itself should be configured to perform retries in case of an error.