Type: Bug
Resolution: Done
Priority: Critical
Fix Version/s: None
Affects Version/s: None
Component/s: QE
Labels:
None

Blocked:
False
Blocked Reason:

None
Ready:
False
Git Pull Request:
https://gitlab.cee.redhat.com/istio/servicemesh-qe/jenkins-csb-declaration/-/merge_requests/665, https://gitlab.cee.redhat.com/istio/servicemesh-qe/jenkins-csb-declaration/-/commit/f291836cc32872fdac2a202c905fb51dc9f57468, https://gitlab.cee.redhat.com/istio/servicemesh-qe/jenkins-csb-declaration/-/merge_requests/735
Intelligence Requested:
Market:

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Catalogsources never get into the Running state on hive RHOS-01 clusters.

The catalogsource pod stays at 0/1 CrashLoopBackOff.
The pod is able to pull the image (an iib from the cvp pipeline, e.g. registry-proxy.engineering.redhat.com/rh-osbs/iib:951734), but it never reaches the 1/1 Running state.
The log contains only these two lines:

time="2025-04-08T14:34:26Z" level=info msg="starting pprof endpoint" address="localhost:6060"
time="2025-04-08T14:34:26Z" level=info msg="found existing cache contents" backend=pogreb.v1 cache=/tmp/cache configs=/configs

and the pod events contain only this entry:

Startup probe failed: timeout: failed to connect service ":50051" within 1s

The other catalogsources (e.g. certified-operators) are running correctly.

Currently, all our OCP clusters deployed via hive in the internal RHOS-01 PSI infrastructure are experiencing this issue, even though the nodes have plenty of free resources. The same clusters on RHOS-D don't have this issue, so I suspect that the root cause is the infrastructure itself.
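
One way to tell a slow-starting registry server apart from one that never listens at all is to poll the port for longer than the probe's 1s window. The sketch below is only an illustration: it does a plain TCP connect, which is a simplification of whatever check the startup probe really performs, and the helper names (can_connect, wait_for_port) are hypothetical. Only the port 50051 and the 1s limit come from the event above; where the script is run from (for example through a port-forward to the pod) is an assumption.

```python
import socket
import time


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 10,
                  delay: float = 1.0, timeout: float = 1.0) -> int:
    """Poll host:port up to `attempts` times.

    Returns the 1-based number of the first attempt that connected,
    or 0 if none of them did.
    """
    for attempt in range(1, attempts + 1):
        if can_connect(host, port, timeout):
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    return 0


# Hypothetical usage, assuming the pod's port is reachable on localhost:
#   wait_for_port("localhost", 50051, attempts=60)
```

A non-zero result after many attempts would point to a server that starts too slowly for the probe, while 0 would suggest it never opens the port.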

mentioned on

Merge request - Update catalog source resource

Assignee: Matej Kralik

Reporter: Matej Kralik

Votes: 0

Watchers: 1

Created: 2025/04/09 8:27 AM

Updated: 2025/07/11 6:39 AM

Resolved: 2025/07/11 6:39 AM
