Type: Bug
Resolution: Cannot Reproduce
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.15
Component/s: Etcd
Labels:
- pmr-ai

Activity Type:
Incidents & Support
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

A ROSA cluster running Konflux is unhealthy and inaccessible to SRE. We were able to SSH directly into the control-plane nodes to troubleshoot, and the etcd pods appear to start up, form a quorum, and then die, repeatedly and without a clear cause. The cluster remains extremely unhealthy as a result.

Version-Release number of selected component (if applicable):

4.15.36

How reproducible:

Very, on the affected cluster at the moment. It is not clear how to recreate this on a separate cluster.

Steps to Reproduce:

    1.
    2.
    3.

Actual results:

The cluster is unresponsive; etcd cannot seem to hold quorum after initially forming it.

Expected results:

etcd holds quorum once it has formed.

Additional info:

The current theory is that excessive querying from customer workloads may be contributing, but we are still working to prove or disprove this. The main workload is Tekton, which is known to be extremely resource intensive, and the cluster's control plane has been scaled repeatedly to accommodate it.
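
For anyone reading the symptom description without etcd background, here is a minimal sketch of the majority rule behind "holding quorum". etcd uses Raft, which needs a strict majority of voting members to be healthy. The member counts are illustrative assumptions (this ticket does not state the cluster's etcd member count), and the function names are hypothetical helpers, not part of any etcd tooling.

```python
def quorum_size(members: int) -> int:
    """Smallest strict majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one voting member")
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True when enough members are healthy to form a majority."""
    return healthy >= quorum_size(members)


def failure_tolerance(members: int) -> int:
    """How many members can be lost while a majority still remains."""
    return members - quorum_size(members)


if __name__ == "__main__":
    # Assumed three-member control plane, for illustration only.
    members = 3
    for healthy in range(members, -1, -1):
        state = "quorum" if has_quorum(members, healthy) else "NO quorum"
        print(f"{healthy}/{members} healthy: {state}")
```

Under that assumption, a three-member cluster tolerates one member being down; if two etcd pods die at the same time, the remaining member cannot serve writes, which would match the unresponsive behaviour described above.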

Assignee: Dean West

Reporter: Trevor Nierman

Need Info From: None

Contributors: None

QA Contact: Ge Liu

Doc Contact: None

Votes: 0

Watchers: 3

Created: 2025/01/03 5:33 PM

Updated: 2025/07/02 12:59 PM

Resolved: 2025/05/17 3:11 AM
