-
Epic
-
Resolution: Done
-
Major
-
None
-
None
-
Reduce time taken for followers to catch-up
-
False
-
None
-
False
-
No
-
To Do
-
MGDSRVS-48 - Be able to sustain an external paying customer in production
-
0% To Do, 0% In Progress, 100% Done
-
---
-
---
WHAT
Kafka has the notion of leader and follower brokers. For each topic partition, one broker is the leader and, because the replication factor is 3, two other brokers are followers. Followers replicate data by fetching it from the leader, and each broker dedicates threads (replica fetchers) to this work. If the number of threads is inadequate, replication requests queue up and the replicas may fall out of sync. This can affect the customer's ability to produce messages to their Kafka instance.
RHOSAK currently uses the default value of num.replica.fetchers (1 thread). This might be a bottleneck for some use cases.
It is also worth checking how other service providers set this configuration. For example, MSK increased this number to 2.
WHY
Improve follower fetch performance.
HOW
1. Run benchmark tests to find the best configuration for RHOSAK. One suggestion is to create large partitions with a replication factor of 1 and feed in some data. Then increase the replication factor and measure how long the newly added replicas take to catch up with the leaders.
Note: While increasing the replication factor, keep producing records to and consuming records from the brokers, to mimic a real production environment.
2. Update the service to use the new number of threads.
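The benchmark in step 1 could be driven by a small script along the following lines. This is only a sketch under stated assumptions: `build_reassignment` and `time_catch_up` are hypothetical helper names, the `fetch_state` callable stands in for whatever admin tooling reports the assigned replicas and the in-sync replicas (ISR) per partition, and the placement logic is a simple rotation rather than a recommended strategy.

```python
import json
import time
from typing import Callable, Dict, List, Set, Tuple

TopicPartition = Tuple[str, int]
# Per partition: (assigned replica ids, in-sync replica ids)
ReplicaState = Dict[TopicPartition, Tuple[Set[int], Set[int]]]


def build_reassignment(topic: str, current_replica: Dict[int, int],
                       brokers: List[int], rf: int) -> str:
    """Build a reassignment plan that raises the replication factor to `rf`.

    `current_replica` maps partition -> broker holding the single existing
    replica. That broker stays first in the list so it remains the
    preferred leader; the added replicas are rotated across the others.
    """
    if rf > len(brokers):
        raise ValueError("replication factor cannot exceed broker count")
    plan = []
    for partition, leader in sorted(current_replica.items()):
        others = [b for b in brokers if b != leader]
        start = partition % len(others) if others else 0
        rotated = others[start:] + others[:start]
        plan.append({"topic": topic, "partition": partition,
                     "replicas": [leader] + rotated[: rf - 1]})
    return json.dumps({"version": 1, "partitions": plan})


def time_catch_up(fetch_state: Callable[[], ReplicaState],
                  poll_interval_s: float = 1.0,
                  timeout_s: float = 3600.0,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> float:
    """Poll until every assigned replica is in the ISR; return elapsed seconds.

    `fetch_state` is an assumption: it must be supplied by the caller and
    return the current replica and ISR sets for the partitions under test.
    """
    start = clock()
    while True:
        state = fetch_state()
        if state and all(replicas <= isr for replicas, isr in state.values()):
            return clock() - start
        if clock() - start > timeout_s:
            raise TimeoutError("replicas did not catch up within the timeout")
        sleep(poll_interval_s)
```

The JSON returned by `build_reassignment` follows the format accepted by `kafka-reassign-partitions.sh --reassignment-json-file`. Running `time_catch_up` once per candidate `num.replica.fetchers` value, with producers and consumers active throughout, would give comparable catch-up times for each setting.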
1. Replication performance test | Closed | Sam Barker