Type: Story
Resolution: Done
The related Jira is TRT-529; see this Slack thread for context.
We will target the openshift-config-operator pods first, since that operator is one of the most commonly affected.
In this chart, we see the symptom we're trying to track for openshift-config-operator. Note the reason/ReadinessFailed events with "Client.Timeout exceeded".
In this log, we see:
E0829 12:50:46.153659 1 timeout.go:141] post-timeout activity - time-elapsed: 401.468µs, GET "/healthz" result: <nil>
E0829 12:55:55.138175 1 timeout.go:141] post-timeout activity - time-elapsed: 36.423581ms, GET "/healthz" result: <nil>
E0829 12:57:04.484155 1 timeout.go:141] post-timeout activity - time-elapsed: 301.763968ms, GET "/healthz" result: <nil>
E0829 12:58:13.315812 1 timeout.go:141] post-timeout activity - time-elapsed: 60.883527ms, GET "/healthz" result: <nil>
E0829 13:00:31.233383 1 timeout.go:141] post-timeout activity - time-elapsed: 134.115856ms, GET "/healthz" result: <nil>
E0829 13:02:49.168533 1 timeout.go:141] post-timeout activity - time-elapsed: 974.408µs, GET "/healthz" result: <nil>
E0829 13:02:49.474493 1 timeout.go:141] post-timeout activity - time-elapsed: 305.37752ms, GET "/healthz" result: <nil>
You can see there is a lot of latency in the probe replies: each "post-timeout activity" entry means the /healthz handler was still doing work after the request had already timed out, and "time-elapsed" shows how late that activity was.
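To get a rough sense of how much latency is involved, something like the following could pull the "time-elapsed" values out of a saved pod log. This is only a sketch, assuming the log lines keep the exact format shown above; it is not part of the test itself.

package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"time"
)

// Matches the "post-timeout activity" lines shown above and captures the
// time-elapsed value (e.g. "401.468µs" or "36.423581ms").
var postTimeoutRE = regexp.MustCompile(`post-timeout activity - time-elapsed: (\S+), GET "/healthz"`)

func main() {
	var durations []time.Duration
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		m := postTimeoutRE.FindStringSubmatch(scanner.Text())
		if m == nil {
			continue
		}
		// time.ParseDuration understands the µs/ms suffixes used in the log.
		d, err := time.ParseDuration(m[1])
		if err != nil {
			continue
		}
		durations = append(durations, d)
	}
	if len(durations) == 0 {
		fmt.Println("no post-timeout activity lines found")
		return
	}
	var max, total time.Duration
	for _, d := range durations {
		total += d
		if d > max {
			max = d
		}
	}
	fmt.Printf("post-timeout /healthz replies: count=%d avg=%v max=%v\n",
		len(durations), total/time.Duration(len(durations)), max)
}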
The point of the test is to identify how frequently the problem is happening and in which jobs.
After that, we can take the next steps discussed in the Slack thread linked above, including trying to understand why the /healthz probe is taking so long.
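As a rough illustration of the kind of check the test could perform (not the actual implementation, which would live in the existing test/monitor framework), a client-go sketch could count the ReadinessFailed events mentioning "Client.Timeout exceeded" for openshift-config-operator pods. The namespace and pod-name prefix below are assumptions based on the target operator.

package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a kubeconfig in the default location; the real test would use
	// the cluster client provided by the test framework.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Assumed namespace for the operator's pods.
	events, err := client.CoreV1().Events("openshift-config-operator").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	count := 0
	for _, ev := range events.Items {
		// Count readiness-probe failures caused by client timeouts, the
		// symptom shown in the chart above.
		if ev.Reason == "ReadinessFailed" &&
			strings.Contains(ev.Message, "Client.Timeout exceeded") &&
			strings.HasPrefix(ev.InvolvedObject.Name, "openshift-config-operator") {
			count++
			fmt.Printf("%s %s: %s\n", ev.LastTimestamp, ev.InvolvedObject.Name, ev.Message)
		}
	}
	fmt.Printf("found %d probe-timeout readiness failures\n", count)
}

Aggregating counts like this per CI job is what would tell us how widespread the problem is.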