  1. AMQ Streams
  2. ENTMQST-1922

KafkaRoller blocks rolling update when topic with RF=1 and MIN-ISR=2 exists


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • 1.5.0.GA
    • 1.5.0.GA

      It looks like the KafkaRoller has an issue with topics whose replication factor is lower than the min.insync.replicas option. It waits until such a topic can be rolled safely, but never completes, because such a topic will never be safe to roll. Ironically, this also blocks the rolling update itself, so I cannot remove min.insync.replicas from the cluster config after running into the issue.

      I ran into this with the following scenario:

      • Have an auto-created topic with RF=1
      • Set the cluster-wide option min.insync.replicas to 2
      • Trigger a rolling update for any reason
      2020-05-06 12:41:26 INFO  KafkaRoller:247 - Pod 0 needs to be restarted. Reason: Pod has old generation
      2020-05-06 12:41:26 INFO  KafkaAvailability:109 - my-topic2/0 is already underreplicated (|ISR|=1, min.insync.replicas=2); broker 0 has a replica, so should not be restarted right now (it might be first to catch up).
      2020-05-06 12:41:26 INFO  KafkaRoller:218 - Could not roll pod 0 due to io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-cluster-kafka-0 is currently not rollable, retrying after at least 500ms
      2020-05-06 12:41:27 INFO  KafkaRoller:247 - Pod 0 needs to be restarted. Reason: Pod has old generation
      2020-05-06 12:41:27 INFO  KafkaAvailability:109 - my-topic2/0 is already underreplicated (|ISR|=1, min.insync.replicas=2); broker 0 has a replica, so should not be restarted right now (it might be first to catch up).
      2020-05-06 12:41:27 INFO  KafkaRoller:218 - Could not roll pod 0 due to io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-cluster-kafka-0 is currently not rollable, retrying after at least 1000ms
      2020-05-06 12:41:28 INFO  KafkaRoller:247 - Pod 0 needs to be restarted. Reason: Pod has old generation
      2020-05-06 12:41:28 INFO  KafkaAvailability:109 - my-topic2/0 is already underreplicated (|ISR|=1, min.insync.replicas=2); broker 0 has a replica, so should not be restarted right now (it might be first to catch up).
      2020-05-06 12:41:28 INFO  KafkaRoller:218 - Could not roll pod 0 due to io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-cluster-kafka-0 is currently not rollable, retrying after at least 2000ms
      2020-05-06 12:41:30 INFO  KafkaRoller:247 - Pod 0 needs to be restarted. Reason: Pod has old generation
      2020-05-06 12:41:31 INFO  KafkaAvailability:109 - my-topic2/0 is already underreplicated (|ISR|=1, min.insync.replicas=2); broker 0 has a replica, so should not be restarted right now (it might be first to catch up).
      2020-05-06 12:41:31 INFO  KafkaRoller:218 - Could not roll pod 0 due to io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-cluster-kafka-0 is currently not rollable, retrying after at least 4000ms
      2020-05-06 12:41:35 INFO  KafkaRoller:247 - Pod 0 needs to be restarted. Reason: Pod has old generation
      2020-05-06 12:41:35 INFO  KafkaAvailability:109 - my-topic2/0 is already underreplicated (|ISR|=1, min.insync.replicas=2); broker 0 has a replica, so should not be restarted right now (it might be first to catch up).
      2020-05-06 12:41:35 INFO  KafkaRoller:218 - Could not roll pod 0 due to io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-cluster-kafka-0 is currently not rollable, retrying after at least 8000ms
      2020-05-06 12:41:43 INFO  KafkaRoller:247 - Pod 0 needs to be restarted. Reason: Pod has old generation
      2020-05-06 12:41:43 INFO  KafkaAvailability:109 - my-topic2/0 is already underreplicated (|ISR|=1, min.insync.replicas=2); broker 0 has a replica, so should not be restarted right now (it might be first to catch up).
      2020-05-06 12:41:43 INFO  KafkaRoller:218 - Could not roll pod 0 due to io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-cluster-kafka-0 is currently not rollable, retrying after at least 16000ms
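
      The retry delays in the log above double on every attempt (500ms, 1000ms, 2000ms, up to 16000ms). A minimal sketch of such a doubling schedule is shown below. The class name, method name, and the upper cap are hypothetical and only for illustration; they are not taken from the Strimzi code, and the log does not show whether a cap exists.

```java
// Hypothetical sketch of a doubling retry delay; not Strimzi's actual implementation.
final class BackoffSketch {

    /**
     * Delay before retry number {@code attempt} (0-based): starts at
     * initialDelayMs and doubles on each attempt, never exceeding maxDelayMs.
     * The cap is an assumption added for safety in this sketch.
     */
    static long delayMs(int attempt, long initialDelayMs, long maxDelayMs) {
        if (attempt < 0 || initialDelayMs <= 0 || maxDelayMs < initialDelayMs) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = initialDelayMs;
        // Stop doubling once the cap is reached, which also avoids overflow.
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }
}
```

      With an initial delay of 500ms, attempts 0 through 5 give the six delays seen in the log. Backing off does not help in this scenario, because the condition being waited for can never become true.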
      

      Since we obviously cannot keep the topic available when RF <= MIN-ISR, the KafkaRoller should just roll the pod regardless of such a topic.
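
      The proposal above can be sketched as a per-partition check. This is a simplified illustration under stated assumptions, not the actual KafkaRoller or KafkaAvailability code: the class, the fields, and the rule "block when |ISR| <= min.insync.replicas" are all hypothetical simplifications.

```java
import java.util.List;

// Hypothetical sketch of the proposed behaviour; not Strimzi's actual code.
final class RollCheckSketch {

    /** Minimal view of one partition; all field names are assumptions. */
    static final class PartitionState {
        final int replicationFactor;
        final int isrSize;
        final int minIsr;
        final boolean brokerHasReplica;

        PartitionState(int replicationFactor, int isrSize, int minIsr, boolean brokerHasReplica) {
            this.replicationFactor = replicationFactor;
            this.isrSize = isrSize;
            this.minIsr = minIsr;
            this.brokerHasReplica = brokerHasReplica;
        }
    }

    /** True if this partition should prevent the broker from being restarted now. */
    static boolean blocksRoll(PartitionState p) {
        if (!p.brokerHasReplica) {
            return false; // the broker holds no replica of this partition
        }
        if (p.replicationFactor <= p.minIsr) {
            // Restarting any replica takes |ISR| below min.insync.replicas no
            // matter how long we wait, so waiting cannot help: do not block.
            return false;
        }
        // Simplified rule: block while losing one in-sync replica would leave
        // fewer than min.insync.replicas replicas in sync.
        return p.isrSize - 1 < p.minIsr;
    }

    /** The broker can be rolled only if no partition blocks it. */
    static boolean canRoll(List<PartitionState> partitions) {
        return partitions.stream().noneMatch(RollCheckSketch::blocksRoll);
    }
}
```

      With this rule, the partition from the log (RF=1, |ISR|=1, min.insync.replicas=2) no longer blocks the roll, while a partition with RF=3 and |ISR|=2 still does.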

      Upstream issue Strimzi#2964

            Unassigned
            Jakub Scholz (scholzj)
            Lukas Kral