Description of problem:
All default catalogsources in 4.11 are built using file-based catalogs. Those catalogsources fail to deploy successfully in a 4.11 OCP cluster. Multiple CI runs on nightly builds have failed for this reason.
The main culprit is the longer processing time for YAML/JSON unmarshalling in the registry pod. The proposal to address this issue is to add a startupProbe to the registry pod. The startupProbe will check gRPC health before the liveness/readiness probes are activated.
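For illustration, the proposed probe layout might look roughly like the sketch below. It assumes the probes exec `grpc_health_probe` against port 50051, which is what the error text in this report suggests. The thresholds, periods, delays, and all function names are hypothetical and are not taken from the actual fix. The point it shows is that the startup probe gives the container up to failureThreshold x periodSeconds to come up before the kubelet restarts it, and the liveness/readiness probes do not run until the startup probe has succeeded once.

```python
import json

GRPC_PORT = 50051  # port taken from the failure message in this report


def grpc_health_exec(port: int = GRPC_PORT) -> dict:
    """Exec handler that runs grpc_health_probe against the registry's gRPC port."""
    return {"exec": {"command": ["grpc_health_probe", f"-addr=:{port}"]}}


def registry_probes(failure_threshold: int = 15, period_seconds: int = 10) -> dict:
    """Probe fields for the registry container (all numeric values are assumptions)."""
    return {
        # Until this probe succeeds once, the kubelet does not run the
        # liveness and readiness probes defined below.
        "startupProbe": {
            **grpc_health_exec(),
            "failureThreshold": failure_threshold,
            "periodSeconds": period_seconds,
        },
        "livenessProbe": {**grpc_health_exec(), "initialDelaySeconds": 10},
        "readinessProbe": {**grpc_health_exec(), "initialDelaySeconds": 5},
    }


def startup_budget_seconds(probes: dict) -> int:
    """Upper bound on startup time before the kubelet restarts the container."""
    startup = probes["startupProbe"]
    return startup["failureThreshold"] * startup["periodSeconds"]


if __name__ == "__main__":
    probes = registry_probes()
    print(json.dumps(probes, indent=2))
    print("startup budget (s):", startup_budget_seconds(probes))
```

With a layout like this, slow unmarshalling of a large file-based catalog only needs to finish within the startup budget, rather than within the 1s window of a single readiness check.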
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy a 4.11 OpenShift cluster
2. Check registry pods for default catalogsources such as redhat-operators
Actual results:
The pods fail due to liveness/readiness probe failures:
openshift-marketplace pod/redhat-operators-h22ms node/ci-op-s04xckx3-de73b-7fxs4-master-1 - reason/Unhealthy Readiness probe failed: timeout: failed to connect service ":50051" within 1s
Expected results:
The registry pods for default catalogsources should be up and running.
See Slack thread for more information:
Note: This bug is for the backporting process. The 4.10.z BZ is https://bugzilla.redhat.com/show_bug.cgi?id=2115874