-
Story
-
Resolution: Won't Do
-
2.3.0.GA
- Add automatic deletion of fully consumed records as an alternative to retention- and size-based deletion
- Provide the ability to periodically delete records that have already been consumed by all consumer groups (or by specific consumer groups).
- As an extension, to keep such records for a while after they have been consumed, we could add our own retention-like configuration, separate from Kafka's.
- Motivation
- With log.cleanup.policy=delete, Kafka deletes records based on retention time and log size, so many users define their retention in advance, based on assumptions about message consumption and failure recovery time.
- However, the load and the recovery time can exceed those assumptions, in which case messages are lost before they are consumed. Some users never want records deleted until all consumer groups have consumed them, so a fixed retention setting may not be appropriate for all users.
- Conversely, in use cases where messages are no longer needed once all consumer groups have consumed them, disk resources are wasted until the retention period expires.
- The Kafka client provides AdminClient.deleteRecords[1], AdminClient.listConsumerGroups[2], and AdminClient.listOffsets with OffsetSpec.forTimestamp[3]. It is therefore technically possible to obtain the offsets committed by consumer groups, or the offsets corresponding to a specific timestamp, and then delete the records prior to those offsets.
[1] AdminClient.deleteRecords
https://kafka.apache.org/32/javadoc/org/apache/kafka/clients/admin/KafkaAdminClient.html#deleteRecords(java.util.Map,org.apache.kafka.clients.admin.DeleteRecordsOptions)
[2] AdminClient.listConsumerGroups
https://kafka.apache.org/32/javadoc/org/apache/kafka/clients/admin/KafkaAdminClient.html#listConsumerGroups(org.apache.kafka.clients.admin.ListConsumerGroupsOptions)
[3] AdminClient.listOffsets
https://kafka.apache.org/32/javadoc/org/apache/kafka/clients/admin/KafkaAdminClient.html#listOffsets(java.util.Map,org.apache.kafka.clients.admin.ListOffsetsOptions)
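The core of the approach described above is working out, for each partition, the offset below which every consumer group has already consumed. The following is a minimal sketch of that step only, in plain Java without the Kafka client. The class name SafeDeleteOffsets, the method compute, and the use of strings such as "orders-0" as partition keys are hypothetical and chosen for illustration. It also assumes a conservative rule that is not stated in the issue: a partition is eligible only if every listed group has a committed offset for it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka: for each partition, finds the lowest
// committed offset across all given consumer groups.
final class SafeDeleteOffsets {

    /**
     * @param committedByGroup group id -> (partition name -> committed offset)
     * @return partition name -> offset before which every group has consumed;
     *         partitions missing from any group are left out (assumed rule)
     */
    static Map<String, Long> compute(Map<String, Map<String, Long>> committedByGroup) {
        Map<String, Long> safe = new HashMap<>();
        boolean first = true;
        for (Map<String, Long> offsets : committedByGroup.values()) {
            if (first) {
                // Start from the first group's committed offsets.
                safe.putAll(offsets);
                first = false;
                continue;
            }
            // Drop partitions this group has not committed, then keep the lower offset.
            safe.keySet().retainAll(offsets.keySet());
            for (Map.Entry<String, Long> e : safe.entrySet()) {
                e.setValue(Math.min(e.getValue(), offsets.get(e.getKey())));
            }
        }
        return safe;
    }

    public static void main(String[] args) {
        // Made-up example data: two groups, two partitions.
        Map<String, Map<String, Long>> committed = new HashMap<>();
        committed.put("billing", Map.of("orders-0", 40L, "orders-1", 15L));
        committed.put("audit", Map.of("orders-0", 55L));
        System.out.println(compute(committed));
    }
}
```

In a real tool, the input map could presumably be filled from Admin.listConsumerGroupOffsets(groupId).partitionsToOffsetAndMetadata() for each group returned by listConsumerGroups, and each resulting offset passed to deleteRecords as RecordsToDelete.beforeOffset(offset). The listConsumerGroupOffsets call is not mentioned in the issue and is named here as an assumption about how the committed offsets would be fetched.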