Type: Task
Resolution: Won't Do
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Labels: None
Activity Type: None
Blocked: False
Blocked Reason: None
Ready: False
Epic Link: None
Story Points: None
Target Version: None
Release Blocker: None
Sprint: None

Currently we see this issue:

Aug 28 00:02:20.755103 ip-10-0-131-145 hyperkube[1366]: I0828 00:02:20.755067 1366 prober.go:116] "Probe failed" probeType="Readiness" pod="openshift-etcd/etcd-quorum-guard-588ff9b55d-8lhb7" podUID=5b79def2-9e56-4c93-b8ab-1d04db0f552f containerName="guard" probeResult=failure output=""
Then, a few seconds later:
Aug 28 00:02:25.797258 ip-10-0-131-145 hyperkube[1366]: I0828 00:02:25.797231 1366 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-etcd/etcd-quorum-guard-588ff9b55d-8lhb7"
Proposal: improve the clustermembercontroller sync loop for health status, or at least keep the quorum guard readiness probe from failing during install or scaling. Consider tracking this with metrics rather than operator status.
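
To evaluate any fix, it may help to measure how long the guard pod stays unready in a kubelet log. The following is a minimal sketch, not part of the kubelet or the etcd operator: it pairs each "Probe failed" readiness line with the next "SyncLoop (probe)" ready line for the same pod and reports the gap. The function names `parse_event` and `readiness_flaps` are hypothetical, and the placeholder year is an assumption needed only because the klog header carries no year.

```python
import re
from datetime import datetime

# klog header, e.g. "I0828 00:02:20.755067": severity letter, MMDD, wall-clock time.
KLOG_TS = re.compile(r"\b[IWEF](\d{4} \d{2}:\d{2}:\d{2}\.\d{6})\b")
POD = re.compile(r'\bpod="([^"]+)"')

# Assumption: the klog header has no year, so a fixed placeholder year is used.
# That is adequate for computing gaps within a single log.
PLACEHOLDER_YEAR = "2000"


def parse_event(line):
    """Return (timestamp, pod, kind) for a readiness log line, or None."""
    ts_match = KLOG_TS.search(line)
    pod_match = POD.search(line)
    if not ts_match or not pod_match:
        return None
    if '"Probe failed"' in line and 'probeType="Readiness"' in line:
        kind = "failed"
    elif ('"SyncLoop (probe)"' in line and 'probe="readiness"' in line
          and 'status="ready"' in line):
        kind = "ready"
    else:
        return None
    ts = datetime.strptime(PLACEHOLDER_YEAR + ts_match.group(1),
                           "%Y%m%d %H:%M:%S.%f")
    return ts, pod_match.group(1), kind


def readiness_flaps(lines):
    """Yield (pod, seconds_unready) for each failure followed by a ready event."""
    first_failure = {}
    for line in lines:
        event = parse_event(line)
        if event is None:
            continue
        ts, pod, kind = event
        if kind == "failed":
            # Keep the earliest failure so repeated failures count as one flap.
            first_failure.setdefault(pod, ts)
        elif pod in first_failure:
            yield pod, (ts - first_failure.pop(pod)).total_seconds()
```

Applied to the two log lines quoted above, this yields a single flap of roughly five seconds for the etcd-quorum-guard pod.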

See the Slack thread for more context: https://coreos.slack.com/archives/C027U68LP/p1630506922034600

AC:

Decide which approach to take and present it in the team meeting.
Implement the proposed solution.

links to

openshift/origin#26439: ETCD-234: pkg/synthetictests: add etcd quorum-gaurd duplicate events to known problems

Assignee: Unassigned
Reporter: Ljiljana Cosic (Inactive)
Need Info From: None
Votes: 0
Watchers: 2
Created: 2021/09/01 3:27 PM
Updated: 2022/10/11 1:46 PM
Resolved: 2022/10/11 1:46 PM
