AMQ Streams / ENTMQST-4127

KafkaRoller: Post restart leader election barrier

      Why/What

      Before a broker shuts down, Kafka moves partition leadership from it to other in-sync replicas. As a result, after a single broker restart or a rolling restart, partition leaders will be unevenly distributed across the brokers.

      Kafka already has a config, `auto.leader.rebalance.enable`, which automatically rebalances leaders in the cluster. However, it depends on `leader.imbalance.per.broker.percentage` (default 10%) and only checks every `leader.imbalance.check.interval.seconds` (default 5 minutes). That is, after a rolling upgrade or a restart forced by the operator, Kafka can't detect the imbalance immediately, and even once it is detected, no action is taken if the imbalance ratio does not exceed `leader.imbalance.per.broker.percentage`.
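To make the threshold concrete, here is a minimal sketch of the per-broker check, assuming the imbalance ratio is the fraction of partitions that prefer a broker but are currently led by another one. The class and method names (`LeaderImbalance`, `imbalanceRatio`, `wouldTrigger`) are hypothetical and are not Kafka or KafkaRoller code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka: models the per-broker imbalance check.
final class LeaderImbalance {

    /** One partition: its preferred leader (first replica) and its current leader. */
    static final class PartitionState {
        final int preferredLeader;
        final int currentLeader;

        PartitionState(int preferredLeader, int currentLeader) {
            this.preferredLeader = preferredLeader;
            this.currentLeader = currentLeader;
        }
    }

    /** Fraction of the partitions preferring {@code broker} that it does not currently lead. */
    static double imbalanceRatio(int broker, List<PartitionState> partitions) {
        long preferred = 0;
        long notLed = 0;
        for (PartitionState p : partitions) {
            if (p.preferredLeader == broker) {
                preferred++;
                if (p.currentLeader != broker) {
                    notLed++;
                }
            }
        }
        return preferred == 0 ? 0.0 : (double) notLed / preferred;
    }

    /** Assumption: a rebalance is triggered only when the ratio strictly exceeds the percentage. */
    static boolean wouldTrigger(int broker, List<PartitionState> partitions, int percentage) {
        return imbalanceRatio(broker, partitions) > percentage / 100.0;
    }

    public static void main(String[] args) {
        // Broker 0 is preferred for 20 partitions; one of them is still led by broker 1.
        List<PartitionState> partitions = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            partitions.add(new PartitionState(0, i == 0 ? 1 : 0));
        }
        // 1 of 20 is 5%, which is below the 10% default, so no automatic rebalance.
        System.out.println("ratio=" + imbalanceRatio(0, partitions)
                + " triggers=" + wouldTrigger(0, partitions, 10));
    }
}
```

Under this model, a broker that is preferred for many partitions can stay slightly imbalanced indefinitely, which is the gap this issue aims to close.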

      Relying on a third-party tool such as Cruise Control is an alternative way to fix this leader imbalance. However, Cruise Control is not a required component for running Kafka.

      Therefore, we think proactively calling the Admin API to rebalance leaders after a broker restart is a good solution, since the KafkaRoller knows when brokers have been restarted.

      How

      Call the ELECT_LEADERS API to force a preferred leader election.
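As a rough sketch of what the roller could do before issuing that call, the code below picks the partitions whose current leader is not the preferred leader (the first replica in the assignment) while the preferred leader is back in the ISR. The `Partition` type and `needingElection` method are hypothetical stand-ins. In a real implementation the inputs would presumably come from `Admin.describeTopics` (leader, replicas and ISR per partition), and the selected partitions would be passed as a `Set<TopicPartition>` to `Admin.electLeaders(ElectionType.PREFERRED, partitions)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not KafkaRoller code: selects partitions for preferred leader election.
final class PreferredLeaderCandidates {

    /** Simplified view of one partition's state (stand-in for the Admin API's partition info). */
    static final class Partition {
        final String topic;
        final int partition;
        final int leader;
        final List<Integer> replicas; // assignment order; the first entry is the preferred leader
        final List<Integer> isr;

        Partition(String topic, int partition, int leader, List<Integer> replicas, List<Integer> isr) {
            this.topic = topic;
            this.partition = partition;
            this.leader = leader;
            this.replicas = replicas;
            this.isr = isr;
        }
    }

    /** Returns "topic-partition" names where a preferred election would move leadership. */
    static List<String> needingElection(List<Partition> partitions) {
        List<String> result = new ArrayList<>();
        for (Partition p : partitions) {
            if (p.replicas.isEmpty()) {
                continue;
            }
            int preferred = p.replicas.get(0);
            // Skip partitions already led by the preferred replica, and those whose
            // preferred replica has not yet caught up after the restart.
            if (p.leader != preferred && p.isr.contains(preferred)) {
                result.add(p.topic + "-" + p.partition);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Partition> partitions = Arrays.asList(
            new Partition("orders", 0, 1, Arrays.asList(0, 1, 2), Arrays.asList(0, 1, 2)),
            new Partition("orders", 1, 1, Arrays.asList(1, 2, 0), Arrays.asList(0, 1, 2)),
            new Partition("orders", 2, 0, Arrays.asList(2, 0, 1), Arrays.asList(0, 1)));
        System.out.println(needingElection(partitions));
    }
}
```

Filtering on ISR membership is a design assumption here: a preferred election for a partition whose preferred replica is not yet in sync would not succeed, so the roller might wait for the restarted broker to rejoin the ISR before electing.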

      Done

      After the KafkaRoller restarts brokers, partition leaders should be evenly distributed across the brokers.

              Unassigned Unassigned
              tbentley-1 Tom Bentley