Bug
Resolution: Done
Major
1.3.0.GA
File Source Connector stops when reading a large file
There appears to be a bug in the end condition (1).
Also, as far as I can tell from the source code (2), in the worst case the buffer can use up to twice as much memory as the file size, so an OutOfMemoryError may occur for a large file.
Note that the documentation (3) does not recommend the File Connector for production use.
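To illustrate the memory concern, here is a minimal sketch of a read loop that doubles its buffer whenever it fills up. It is not the connector's actual code: the class name `BufferGrowthSketch`, the method `finalCapacity`, and the initial capacity are hypothetical, and the input is assumed to contain no line terminator, so nothing is ever drained from the buffer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical illustration only; not taken from FileStreamSourceTask.
final class BufferGrowthSketch {

    // Reads all of `data` into a char buffer that doubles when it becomes full,
    // and returns the buffer's final capacity in chars.
    // Assumes initialCapacity > 0.
    static int finalCapacity(String data, int initialCapacity) {
        char[] buffer = new char[initialCapacity];
        int offset = 0;
        try (Reader reader = new StringReader(data)) {
            int nread;
            // buffer.length - offset is always > 0 here, because the buffer
            // is grown as soon as it is full.
            while ((nread = reader.read(buffer, offset, buffer.length - offset)) != -1) {
                offset += nread;
                if (offset == buffer.length) {
                    // Old and new arrays are both live during the copy.
                    char[] newbuf = new char[buffer.length * 2];
                    System.arraycopy(buffer, 0, newbuf, 0, buffer.length);
                    buffer = newbuf;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.length;
    }

    public static void main(String[] args) {
        char[] chars = new char[1024];
        Arrays.fill(chars, 'x'); // 1024 chars, no line terminator
        String data = new String(chars);
        System.out.println("data length:    " + data.length());
        System.out.println("final capacity: " + finalCapacity(data, 1024));
    }
}
```

In this sketch, input that exactly fills the buffer triggers one more doubling, so the final capacity is twice the data length. With a very large file and no line breaks, the same growth pattern would need a correspondingly large heap.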
(1) kafka/connect/file/src/main/java/org/apache/kafka/connect/file/FileStreamSourceTask.java
https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/connect/file/src/main/java/org/apache/kafka/connect/file/FileStreamSourceTask.java#L130
(2) kafka/connect/file/src/main/java/org/apache/kafka/connect/file/FileStreamSourceTask.java
https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/connect/file/src/main/java/org/apache/kafka/connect/file/FileStreamSourceTask.java#L136-L137
(3) Confluent > Kafka Connect FileStream Connectors
https://docs.confluent.io/current/connect/filestream_connector.html#connect-filestreamconnector
The Kafka Connect FileStream Connector examples are intended to show how a simple connector runs for those first getting started with Kafka Connect as either a user or developer. It is not recommended for production use. Instead, we encourage users to use them to learn in a local environment. The examples include both a file source and a file sink to demonstrate an end-to-end data flow implemented through Kafka Connect.