Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: Log Collection
Labels:
- devel_ack?

Activity Type:
Incidents & Support
Blocked:
False
Blocked Reason:

None
Ready:
False
Docs QE Status:
NEW
QE Status:
NEW
Release Note Type:
Bug Fix

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Description of problem:

This issue is similar to ~~LOG-7502~~, but the previous fix did not cover all cases. When the TCP session to the syslog server is killed, log forwarding stops for about 15 minutes. This happens at least in configurations where the syslog server runs as active/standby nodes behind a load balancer. Detecting the broken connection and recovering from it both take a long time, which makes monitoring and troubleshooting difficult.

Version-Release number of selected component (if applicable):

All recent Logging versions that use Vector with the syslog output (socket sink over TCP)

How reproducible:

Always

Steps to Reproduce:

1. Configure a syslog server with two nodes in active/standby mode
2. Create a Kubernetes Service or external load balancer to route traffic to the syslog servers
3. Configure ClusterLogForwarder to send logs to an external syslog server using TCP
4. Deploy an application that generates logs every second
5. Confirm logs are forwarded to the syslog server
6. Kill the TCP session by shutting down the active syslog server node
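
For the application that generates logs every second, a minimal sketch could look like the following. The function name `emit_logs`, the message format, and the `seq=` field are illustrative assumptions and are not part of the original report; the sequence number is there only to make a gap in forwarded logs easy to spot on the syslog server side.

```python
import itertools
import sys
import time
from datetime import datetime, timezone


def emit_logs(count=None, interval=1.0, out=sys.stdout):
    """Write one sequence-numbered line per interval.

    Runs until interrupted when count is None (hypothetical test helper).
    """
    sequence = itertools.count(1) if count is None else range(1, count + 1)
    for seq in sequence:
        timestamp = datetime.now(timezone.utc).isoformat()
        # The collector is assumed to pick these lines up from container stdout.
        out.write(f"{timestamp} seq={seq} msg=heartbeat\n")
        out.flush()
        time.sleep(interval)
```

Calling `emit_logs()` with no arguments in a container entrypoint produces one line per second indefinitely; after the active syslog node is shut down, the first missing `seq` value marks where forwarding stopped.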

Actual results:

Log forwarding stops after the TCP session is killed.
Recovery happens only after the OS TCP timeout expires (~15 minutes) or after a collector pod restart.
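
The ~15 minute figure is consistent with the Linux TCP retransmission timeout under default settings, although the report does not confirm that this is the timer involved. The sketch below reproduces the arithmetic, assuming `net.ipv4.tcp_retries2 = 15`, a minimum retransmission timeout of 200 ms, a maximum of 120 s, and a round-trip time small enough that the first timeout starts at the minimum. The function name is hypothetical.

```python
def retransmission_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Total seconds before Linux gives up on unacknowledged data.

    Assumes exponential backoff that doubles from rto_min and is capped
    at rto_max; the wait after the final retransmission is included.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


if __name__ == "__main__":
    seconds = retransmission_timeout()
    print(f"{seconds:.1f} s, about {seconds / 60:.1f} minutes")
```

Under these assumptions the total is about 924.6 seconds, which matches the observed delay of roughly 15 minutes.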

Expected results:

Vector should detect the broken TCP session quickly and reconnect to the syslog server.
Log forwarding should resume automatically without a manual pod restart.

Additional info:

We also tested with a Kubernetes Service acting as a load balancer for syslog servers in active/standby configuration, and the same issue occurred.
When the connection was actively carrying traffic, the option introduced in ~~LOG-7502~~ (keepalive.time_secs) did not help.
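
One possible explanation, stated here as an assumption: on Linux, TCP keepalive probes are sent only on an idle connection, so when data is in flight and unacknowledged the retransmission timer governs instead. The Linux socket option `TCP_USER_TIMEOUT` bounds how long transmitted data may stay unacknowledged before the kernel closes the connection. The sketch below shows that general technique on a plain Python socket. It is not Vector's implementation and not a confirmed fix for this issue; the function name and the default values are illustrative.

```python
import socket


def configure_fast_failure_detection(sock, user_timeout_ms=10_000, keepalive_idle_s=30):
    """Apply keepalive and, where available, TCP_USER_TIMEOUT to a TCP socket.

    Returns a dict with the values the kernel reports after setting them.
    """
    applied = {}

    # Keepalive covers the idle case: probes start after keepalive_idle_s.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied["keepalive"] = bool(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    if hasattr(socket, "TCP_KEEPIDLE"):  # Linux-specific constant
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, keepalive_idle_s)
        applied["keepalive_idle_s"] = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE)

    # TCP_USER_TIMEOUT covers the active case: unacknowledged data older than
    # user_timeout_ms causes the kernel to abort the connection.
    if hasattr(socket, "TCP_USER_TIMEOUT"):  # Linux-specific constant
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, user_timeout_ms)
        applied["user_timeout_ms"] = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT)

    return applied
```

With both options set, a sender that keeps writing to a dead peer would see its next send fail after roughly `user_timeout_ms` rather than after the full retransmission backoff, at which point it can reconnect.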

Assignee: Unassigned

Reporter: KATSUYA KAWAKAMI

Votes: 0

Watchers: 3

Created: 2025/12/23 4:28 PM

Updated: 2026/01/09 9:22 PM
