Bug
Resolution: Unresolved
Major
Logging 5.9.z, Logging 6.1.z, Logging 6.3.z
Incidents & Support
False
False
NEW
NEW
Bug Fix
Logging - Sprint 278
Important
Description of problem:
When a Kafka output is configured, the forwarded message is larger than 1 MB, and the output's "tuning.maxWrite" option is set to allow writing batches larger than 1 MB, for example:
- kafka:
    url: tcp://kafka.openshift-logging.svc.cluster.local:9092/clo-topic
    authentication:
      sasl:
        mechanism: PLAIN
        password:
          key: password
          secretName: kafka-vector
        username:
          key: username
          secretName: kafka-vector
    tuning:
      maxWrite: 6M
the following error is observed:
2025-08-25T22:52:12.077585Z ERROR sink{component_kind="sink" component_id=output_kafka_app component_type=kafka}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(KafkaError (Message production error: MessageSizeTooLarge (Broker: Message size too large))) request_id=4 error_type="request_failed" stage="sending" internal_log_rate_limit=true
*Notes*:
1. The Kafka server accepts messages larger than 1 MB.
2. If Fluentd (Logging v5) is used instead of Vector, forwarding works and the message arrives in Kafka.
Version-Release number of selected component (if applicable):
The issue was reproduced in Logging 5.9.z, 6.1.8, and 6.3.z.
How reproducible:
Always
Steps to Reproduce:
- Deploy a Kafka server that accepts messages up to 10 MB.
- Configure the "clusterLogForwarder" to forward logs to the Kafka server, setting the output's "tuning.maxWrite" option up to 6M, similar to:
[...]
spec:
  outputs:
  - kafka:
      url: tcp://kafka.openshift-logging.svc.cluster.local:9092/clo-topic
      authentication:
        sasl:
          mechanism: PLAIN
          password:
            key: password
            secretName: kafka-vector
          username:
            key: username
            secretName: kafka-vector
      tuning:
        maxWrite: 6M
    name: kafka-app
    type: kafka
  pipelines:
  - inputRefs:
    - application
    name: test-app
    outputRefs:
    - kafka-app
- Create an application that generates a single log line of about 2 MB:
$ oc new-project test-kafka
$ kubectl create deployment hello-node --image=registry.k8s.io/e2e-test-images/agnhost:2.43 -- /agnhost serve-hostname
$ oc -n test-kafka rsh $(oc get pods -n test-kafka -l app=hello-node -o name)
~ $ for i in $(seq 1 $((2 * 1024 * 1024 / 5))); do echo -n 'hello'; done > /tmp/2mb_hello_line.txt
~ $ du -khs /tmp/2mb_hello_line.txt
2.0M    /tmp/2mb_hello_line.txt
~ $ cat /tmp/2mb_hello_line.txt > /proc/1/fd/1
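The report does not show the broker settings used in the first step; a minimal sketch of a Kafka broker configuration that accepts messages up to 10 MB could look like the following (property names are standard Kafka broker options; the 10485760 value is an assumption matching "up to 10 MB"):

```properties
# server.properties (broker level): allow messages up to ~10 MB
message.max.bytes=10485760
# followers must also be able to replicate the larger messages
replica.fetch.max.bytes=10485760
# alternatively, raise the limit per topic instead of broker-wide:
#   kafka-configs.sh --bootstrap-server <broker>:9092 --alter \
#     --entity-type topics --entity-name clo-topic \
#     --add-config max.message.bytes=10485760
```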
Actual results:
The collector pod running on the same node as the pod that generates the ~2 MB log emits the error below, and the logs never arrive in Kafka.
2025-08-25T22:52:12.077585Z ERROR sink{component_kind="sink" component_id=output_kafka_app component_type=kafka}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(KafkaError (Message production error: MessageSizeTooLarge (Broker: Message size too large))) request_id=4 error_type="request_failed" stage="sending" internal_log_rate_limit=true
If Logging v5 with Fluentd is used instead, the logs are received in Kafka, which shows that the Kafka server itself accepts messages bigger than 1 MB.
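The reproduction loop above writes 2*1024*1024/5 repetitions of "hello" as a single line; a quick sketch (not part of the report) confirming that this one line is just under 2 MiB and therefore well above a 1 MB limit:

```python
# Build the same single-line payload as the reproduction steps.
repeats = 2 * 1024 * 1024 // 5   # number of 'hello' chunks, as in the repro loop
line = "hello" * repeats         # one log line, no newline

size = len(line.encode("utf-8"))
print(size)                      # 2097150 bytes, just under 2 MiB
print(size > 1_000_000)          # True: larger than a 1 MB message limit
```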
Note: The tuning option "maxWrite" [0] is translated to "batch.max_bytes" [1], and this option is passed to the librdkafka "batch.size" option. If, instead of using "maxWrite", the ClusterLogForwarder is moved to "Unmanaged" and librdkafka_options."batch.size" = "3000000" is set by hand, it continues failing.
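For reference, the corresponding portion of the generated Vector sink configuration would look roughly like this (a sketch only; the component name is taken from the error log above, and the exact generated layout is an assumption):

```toml
[sinks.output_kafka_app]
type = "kafka"
bootstrap_servers = "kafka.openshift-logging.svc.cluster.local:9092"
topic = "clo-topic"

[sinks.output_kafka_app.batch]
max_bytes = 6000000  # produced from tuning.maxWrite: 6M

[sinks.output_kafka_app.librdkafka_options]
# the hand-set value tried in Unmanaged mode, which also fails
"batch.size" = "3000000"
```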
Expected results:
The Kafka sink honours the tuning option "maxWrite" and forwards log messages up to the configured size, even when they are bigger than 1 MB.