Type: Bug
Resolution: Not a Bug
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.18.z, 4.19, 4.20
Component/s: apiserver-auth
Labels:
- Escalation
- rits-work

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Moderate
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

Customer Impact:

Customer Escalated

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

The kube-rbac-proxy-web container running in the Alertmanager pod is logging errors:

$ oc logs alertmanager-main-0 -c kube-rbac-proxy-web |tail -n2

2025-09-25T19:48:18.538959699Z I0925 19:48:18.538885       1 log.go:245] http: TLS handshake error from <IP>:42930: write tcp <IP>:9095-><IP>:42930: write: connection reset by peer
2025-09-25T19:48:19.255619320Z I0925 19:48:19.255539       1 log.go:245] http: TLS handshake error from <IP>:49036: write tcp <IP>:9095-><IP>:49036: write: connection reset by peer

Both Alertmanager pods log the same errors. The errors are shown for 2 IPs only, and both IPs are linked to the OpenShift ingress router pods.
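To check the claim that only two peer IPs are involved, the log lines can be grouped by source address. The following is a minimal sketch, not part of the original report: the function name `count_handshake_errors` and the regular expression are illustrative assumptions based on the line format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches the peer address in lines such as:
#   http: TLS handshake error from <IP>:42930: write tcp ...
# The pattern is an assumption derived from the log excerpt in this report.
HANDSHAKE_RE = re.compile(r"TLS handshake error from (\[[^\]]+\]|[^\s:]+):\d+:")


def count_handshake_errors(lines):
    """Return a Counter mapping peer address to number of TLS handshake errors."""
    counts = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            # Strip brackets in case the peer is a bracketed IPv6 address.
            counts[match.group(1).strip("[]")] += 1
    return counts
```

Feeding the function the lines printed by the `oc logs` command above and calling `.most_common()` on the result would list each peer address with its error count, which can then be compared against the ingress router pod IPs.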

Same as in the issue: https://issues.redhat.com/browse/OCPBUGS-5916

No other symptoms are visible.

Version-Release number of selected component (if applicable):

OpenShift 4.18.21

How reproducible:

100% of the time

Steps to Reproduce:

    1. Spin up a 4.18.21 cluster
    2. Check the logs of the kube-rbac-proxy-web container in the Alertmanager pods.

Business Impact:

In summary, we use OpenShift clusters as our secure Kubernetes platform (the only one that is FS validated). The cluster embeds Prometheus, and this component generates 300 errors per minute. It is up to Red Hat to explain what we are missing with this error: are Prometheus and the resulting monitoring broken? We cannot disable Prometheus. We cannot determine whether the error is blocking something, so we might be missing something important: the cluster might be down and Prometheus might fail to raise an alert. The error also makes log analysis very noisy, so we could miss other problems.
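The reported rate of about 300 errors per minute can be verified from the container log itself. The sketch below is an illustration only: it assumes each line starts with an RFC 3339 timestamp, as in the excerpt in the description, and the name `handshake_errors_per_minute` is hypothetical.

```python
import re
from collections import Counter

# Assumes each log line starts with an RFC 3339 timestamp such as
# 2025-09-25T19:48:18.538959699Z; the captured group identifies the minute.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}")


def handshake_errors_per_minute(lines):
    """Return a Counter mapping 'YYYY-MM-DDTHH:MM' to TLS handshake error count."""
    counts = Counter()
    for line in lines:
        if "TLS handshake error" not in line:
            continue  # Ignore unrelated log lines.
        match = TIMESTAMP_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Comparing the per-minute counts over a longer window would show whether the rate is steady, which would be consistent with periodic connections from the ingress routers rather than sporadic failures.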

duplicates

OCPBUGS-63708 The kube-rbac-proxy container in openshift-monitoring pods report TLS handshake error

ASSIGNED

is duplicated by

OCPBUGS-76649 Authentication failure in kube-rbac-proxy-web: square/go-jose: error in cryptographic primitive

ASSIGNED

relates to

OCPBUGS-5916 The kube-rbac-proxy-federate container reporting TLS handshake error

Closed

OCPBUGS-32021 too many "write: connection reset by peer" logs in kube-rbac-proxy-web container logs

Closed

Assignee: Filip Krepinsky

Reporter: Vladislav Walek

Need Info From: None

Contributors: None

QA Contact: Junqi Zhao

Doc Contact: None

Votes: 0

Watchers: 10

Created: 2025/09/26 11:19 PM

Updated: 2026/02/16 10:02 AM

Resolved: 2025/12/05 3:36 PM
