-
Bug
-
Resolution: Done
-
Critical
-
None
-
None
-
None
-
False
-
-
False
-
https://gitlab.cee.redhat.com/istio/servicemesh-qe/jenkins-csb-declaration/-/merge_requests/665, https://gitlab.cee.redhat.com/istio/servicemesh-qe/jenkins-csb-declaration/-/commit/f291836cc32872fdac2a202c905fb51dc9f57468, https://gitlab.cee.redhat.com/istio/servicemesh-qe/jenkins-csb-declaration/-/merge_requests/735
-
-
CatalogSource pods never reach the Running state on hive RHOS-01 clusters (they stay at 0/1 CrashLoopBackOff).
The pod is able to pull the image (an IIB from the CVP pipeline, e.g. registry-proxy.engineering.redhat.com/rh-osbs/iib:951734), but it never reaches the 1/1 Running state.
The pod log contains only these two lines:
time="2025-04-08T14:34:26Z" level=info msg="starting pprof endpoint" address="localhost:6060"
time="2025-04-08T14:34:26Z" level=info msg="found existing cache contents" backend=pogreb.v1 cache=/tmp/cache configs=/configs
and the pod events contain only this event:
Startup probe failed: timeout: failed to connect service ":50051" within 1s
The other catalog sources (e.g. certified-operators) are running correctly.
Currently, all our OCP clusters deployed (via hive) in the internal RHOS-01 PSI infrastructure are experiencing this issue, even though there are plenty of free resources on the nodes. The same clusters on RHOS-D don't have this issue, so I suspect that the root cause is the infrastructure itself.
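To help compare RHOS-01 with RHOS-D, the following is a minimal sketch of a check that could show how long the registry takes to start listening on port 50051. It is only a TCP-level approximation of the startup probe, not the gRPC health check the probe actually performs. The function names (try_connect, poll_port) and the default values are hypothetical, and the sketch assumes that port 50051 of the pod has already been made reachable from wherever the script runs.

```python
import socket
import time
from typing import List, Tuple


def try_connect(host: str, port: int, timeout_s: float = 1.0) -> Tuple[bool, float]:
    """Attempt a single TCP connect; return (connected, elapsed seconds)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start


def poll_port(host: str, port: int, attempts: int = 30,
              interval_s: float = 1.0, timeout_s: float = 1.0) -> List[Tuple[bool, float]]:
    """Retry the connect until it succeeds or attempts run out; return all results."""
    results = []
    for _ in range(attempts):
        result = try_connect(host, port, timeout_s)
        results.append(result)
        if result[0]:
            break  # the port is accepting connections, stop polling
        time.sleep(interval_s)
    return results


if __name__ == "__main__":
    # Assumption: 50051 is reachable on localhost (e.g. through a port-forward).
    for ok, elapsed in poll_port("127.0.0.1", 50051, attempts=5):
        print(f"connected={ok} elapsed={elapsed:.3f}s")
```

If the port starts accepting connections only after many attempts on RHOS-01 but almost immediately on RHOS-D, that would be consistent with slow startup on the RHOS-01 infrastructure rather than a broken image.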
- mentioned on