-
Task
-
Resolution: Unresolved
In the scale lab environment, the FLP logs showed several errors from Loki with "reason: 'entry too far behind'". A probable cause is that flow processing is backed up while flows accumulate in Kafka, so entries reach Loki long after their timestamps:
time=2022-11-17T00:26:03Z level=info component=client error=server returned HTTP status 429 Too Many Requests (429): entry with timestamp 2022-11-16 20:17:26.827000064 +0000 UTC ignored, reason: 'entry too far behind' for stream: {DstK8S_Namespace="openshift-dns", DstK8S_OwnerName="dns-default", FlowDirection="1", app="netobserv-flowcollector"}, fields.level=warn fields.msg=error sending batch, will retry host=lokistack-distributor-http.openshift-operators-redhat.svc:3100 module=export/loki status=429
Two Loki config settings that might be causing this:
max_chunk_age: 2h
reject_old_samples_max_age: 168h
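We're likely hitting the max_chunk_age limit (see https://grafana.com/docs/loki/latest/configuration/#accept-out-of-order-writes ). It defaults to 2h (not configurable via CRD), and the oldest timestamp Loki accepts for a stream is computed as time_of_most_recent_line - (max_chunk_age/2), i.e. one hour behind the stream's most recent line with the default. The rejected entry above is timestamped 20:17:26 UTC but was reported at 00:26:03 UTC, roughly four hours later, which falls well outside that window if the stream had already received recent lines.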
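To make the arithmetic concrete, here is a minimal Python sketch of the window formula quoted above, applied to the timestamps from the log line. It is only an illustration, not Loki's implementation: the function names are hypothetical, the strict `<` comparison is a guess at the boundary behaviour, and it assumes the stream's most recent line was roughly as new as the client log time (00:26:03Z), which the log does not actually tell us.

```python
from datetime import datetime, timedelta, timezone


def oldest_accepted(most_recent_line: datetime, max_chunk_age: timedelta) -> datetime:
    """Earliest timestamp accepted for a stream, per the documented formula:
    time_of_most_recent_line - (max_chunk_age / 2)."""
    return most_recent_line - max_chunk_age / 2


def is_too_far_behind(entry_ts: datetime, most_recent_line: datetime,
                      max_chunk_age: timedelta) -> bool:
    """True if the entry falls before the acceptance window (hypothetical helper)."""
    return entry_ts < oldest_accepted(most_recent_line, max_chunk_age)


# Default from the ticket: max_chunk_age = 2h
MAX_CHUNK_AGE = timedelta(hours=2)

# ASSUMPTION: the newest line already in the stream is about as recent as the
# time the client logged the error. The real value is not in the log.
most_recent = datetime(2022, 11, 17, 0, 26, 3, tzinfo=timezone.utc)

# Timestamp of the rejected entry, truncated to microseconds.
entry = datetime(2022, 11, 16, 20, 17, 26, 827000, tzinfo=timezone.utc)

cutoff = oldest_accepted(most_recent, MAX_CHUNK_AGE)
lag = most_recent - entry
rejected = is_too_far_behind(entry, most_recent, MAX_CHUNK_AGE)

print("cutoff:", cutoff.isoformat())
print("lag:", lag)
print("too far behind:", rejected)
```

Under that assumption the entry lags by a little over four hours against a one-hour window, so it would be rejected regardless of reject_old_samples_max_age (168h), which is far from being the binding limit here.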
Reference to discussion: https://coreos.slack.com/archives/C02939DP5L5/p1668654989194159
- is caused by: NETOBSERV-483 Do performance testing for large cluster (Closed)