OpenShift Logging / LOG-4725

"Too far behind" errors in fluentd


    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • Affects Version: Logging 5.7.6
    • Component: Log Collection
    • Release Note Type: Bug Fix
    • Severity: Important

      Description of problem:

      Error messages like the one below are observed continually:

      openshift-logging/logging-loki-ingester-1[loki-ingester]: level=warn ts=2023-10-04T09:17:18.665393472Z caller=grpc_logging.go:43 method=/logproto.Pusher/Push duration=2.273379ms err="rpc error: code = Code(400) desc = entry with timestamp 2023-10-04 07:36:24.87037 +0000 UTC ignored, reason: 'entry too far behind, oldest acceptable timestamp is: 2023-10-04T08:17:16Z' for stream: {fluentd_thread=\"flush_thread_1\", kubernetes_host=\"woker-1.example.com\", log_type=\"audit\"},\nuser 'audit', total ignored: 1 out of 122" msg=gRPC
      
      $ for pod in $(omc get pods -l component=collector -o name); do omc logs $pod -c collector ; done|grep -c "too far behind"
      18085
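
      To see whether the rejections are confined to a particular node or stream, the same messages can be broken down by the stream labels embedded in the error text. A sketch, assuming the message format shown above (adjust the pattern if the label layout differs):

      # count "too far behind" rejections per kubernetes_host label
      $ for pod in $(omc get pods -l component=collector -o name); do omc logs $pod -c collector; done \
          | grep "too far behind" | grep -o 'kubernetes_host=[^,}]*' | sort | uniq -c | sort -rn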
       

      The following has been checked and ruled out:

       * problems on the node related to memory or CPU
       * the collector hitting its resource limits (CPU limits are not set)
       * buffer files in fluentd that would indicate fluentd is buffering the logs on its side because of a delay delivering them to Loki or to any other configured output (see the sketch after this list)
       * stale timestamps in the source files: the audit logs were reviewed directly from the log files and they are written with the current timestamp
       * clock skew: all nodes are NTP synchronized and use the same timezone and time
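
      As a double check that no buffer chunks exist on disk, the fluentd buffer directory inside a collector pod can be listed. The path below is an assumption about the usual file-buffer location for the fluentd collector; the actual <buffer> path should be confirmed in the pod's fluent.conf:

      # list any on-disk buffer chunks for the configured outputs
      # (/var/lib/fluentd is an assumed default; <collector-pod> is a placeholder)
      $ oc exec -n openshift-logging <collector-pod> -c collector -- ls -laR /var/lib/fluentd/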

      Since there are no buffer files on the buffer path to Loki, the entries being sent cannot be inspected to see their real content and timestamps. Note that the rejection message itself already shows the size of the gap: the rejected entry is from 07:36:24 UTC while the oldest acceptable timestamp is 08:17:16 UTC, so the entry is roughly 41 minutes older than what Loki will accept.
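
      Even without buffer files, the lag can be quantified from the rejection messages themselves, since each one carries both the rejected entry's timestamp and the oldest acceptable timestamp. A minimal sketch, assuming the pod/container names from the log line above and the exact message wording shown there:

      # print rejected-entry timestamp and oldest acceptable timestamp per rejection
      $ omc logs -n openshift-logging logging-loki-ingester-1 -c loki-ingester \
          | grep "too far behind" \
          | grep -oE "entry with timestamp [^,]+|oldest acceptable timestamp is: [^']+"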

      Version-Release number of selected component (if applicable):

      CLO 5.7.6
      Fluentd
      Loki

      How reproducible:

      Not able to reproduce

      Steps to Reproduce:

      Actual results:

      Expected results:

      Additional info:

      • Guidance is needed on troubleshooting the cause of the "too far behind" errors: so far no problems have been found at the node level, and no CPU or memory constraints or buffer files have been found on the fluentd side, as described above.
      • The audit pos files were deleted on one fluentd pod and the pod was restarted, on the theory that fluentd could be re-reading old logs and sending their old timestamps to Loki, which would then reject them. The same error was still seen afterwards, and also for infrastructure logs (see the sketch below for inspecting the pos files directly).
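
      If stale position files are suspected of making fluentd re-read old entries, the in_tail pos files can be inspected directly: each line records the tracked file path, the read offset (hex), and the inode (hex), so a recorded offset far below the current file size would indicate fluentd lagging on reads. A sketch, assuming the pos files live under /var/lib/fluentd/pos/ (confirm the real location in the collector's fluent.conf; <collector-pod> and <audit-pos-file> are placeholders):

      # compare recorded read offsets with the actual log file sizes
      $ oc exec -n openshift-logging <collector-pod> -c collector -- ls -l /var/lib/fluentd/pos/
      $ oc exec -n openshift-logging <collector-pod> -c collector -- cat /var/lib/fluentd/pos/<audit-pos-file>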

            Assignee: Unassigned
            Reporter: Oscar Casal Sanchez (rhn-support-ocasalsa)
            Votes: 0
            Watchers: 5
