Debezium / DBZ-6404

The connector gets stuck when K8s pods are reshuffled


Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • 2.2.0.Final
    • mongodb-connector
    • None
    • False
    • None
    • False

    Description

      The connector can get stuck: it repeatedly logs the message "Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart" and stops processing changes.

      It was reproduced when deployed by Strimzi in K8s.

      How to reproduce 

      Reproducer #1 Delete Kafka Connect pod

      1. Kafka Connect has 1 replica and the Debezium connector is running.
      2. Delete the pod.
      3. A new pod is created, and sometimes the Debezium connector gets stuck on it.

      Reproducer #2 Scale up the number of Kafka Connect pods

      1. Kafka Connect has 1 replica and the Debezium connector is running.
      2. Scale Kafka Connect to 2 replicas.
      3. A new pod is created, and sometimes the Debezium connector gets stuck on it.
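
The two reproducers can be scripted. The following is a minimal sketch, not the reporter's procedure: it only builds the `kubectl` command lines and prints them. The namespace, cluster name, and pod name are placeholders, and it assumes Kafka Connect is managed through a Strimzi `KafkaConnect` resource whose `spec.replicas` field sets the replica count.

```python
import json
import shlex

# All names below are hypothetical placeholders, not values from the report.
NAMESPACE = "my-namespace"
CONNECT_CLUSTER = "my-connect-cluster"          # name of the KafkaConnect resource
CONNECT_POD = "my-connect-cluster-connect-0"    # name of the running Connect pod


def delete_pod_command(pod: str, namespace: str) -> list[str]:
    """Reproducer #1: delete the single Kafka Connect pod."""
    return ["kubectl", "delete", "pod", pod, "-n", namespace]


def scale_connect_command(cluster: str, namespace: str, replicas: int) -> list[str]:
    """Reproducer #2: set spec.replicas on the KafkaConnect resource."""
    patch = json.dumps({"spec": {"replicas": replicas}})
    return [
        "kubectl", "patch", "kafkaconnect", cluster,
        "-n", namespace, "--type", "merge", "-p", patch,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is changed by accident.
    print(shlex.join(delete_pod_command(CONNECT_POD, NAMESPACE)))
    print(shlex.join(scale_connect_command(CONNECT_CLUSTER, NAMESPACE, 2)))
```

To actually run a command, pass the returned list to `subprocess.run(cmd, check=True)`. Since the issue only shows up sometimes, several attempts may be needed.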

       

      A few log lines from before the connector got stuck:

      2023-04-29 09:10:02,010 INFO WorkerSourceTask{id=mongodb-access-apps-0} Source task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSourceTask) [task-thread-mongodb-access-apps-0]
      2023-04-29 09:10:02,010 INFO Attempting to start task (io.debezium.connector.common.BaseSourceTask) [task-thread-mongodb-access-apps-0]
      2023-04-29 09:10:02,025 INFO Loading the custom topic naming strategy plugin: io.debezium.schema.DefaultTopicNamingStrategy (io.debezium.config.CommonConnectorConfig) [task-thread-mongodb-access-apps-0]
      2023-04-29 09:10:02,125 INFO Starting MongoDB connector and discovering replica set(s) at mongodb://foo:27017/?replicaSet=rs27017 (io.debezium.connector.mongodb.MongoDbConnector) [connector-thread-mongodb-access-apps]
      2023-04-29 09:10:02,126 INFO Requested thread factory for connector MongoDbConnector, id = mongodb-access-apps-2 named = replica-set-monitor (io.debezium.util.Threads) [connector-thread-mongodb-access-apps]
      2023-04-29 09:10:02,131 INFO Creating thread debezium-mongodbconnector-mongodb-access-apps-2-replica-set-monitor (io.debezium.util.Threads) [connector-thread-mongodb-access-apps]
      2023-04-29 09:10:02,132 INFO Successfully started MongoDB connector, and continuing to discover changes in replica set(s) at mongodb://foo:27017/?replicaSet=rs27017 (io.debezium.connector.mongodb.MongoDbConnector) [connector-thread-mongodb-access-apps]
      2023-04-29 09:10:02,209 INFO Retrieving configuration from Secret mongo-debezium in namespace jkaspar (io.strimzi.kafka.AbstractKubernetesConfigProvider) [DistributedHerder-connect-1-1]
      2023-04-29 09:10:02,217 INFO SourceConnectorConfig values: 
      config.action.reload = restart
      connector.class = io.debezium.connector.mongodb.MongoDbConnector
      errors.log.enable = true
      errors.log.include.messages = true
      errors.retry.delay.max.ms = 60000
      errors.retry.timeout = 0
      errors.tolerance = all
      header.converter = null
      key.converter = class org.apache.kafka.connect.storage.StringConverter
      name = mongodb-access-apps
      predicates = []
      tasks.max = 1
      topic.creation.groups = []
      transforms = [unwrap, valueToKey, composeKey, extractKey, renameId, removeField, setTopic]
      value.converter = class org.apache.kafka.connect.json.JsonConverter
      (org.apache.kafka.connect.runtime.SourceConnectorConfig) [DistributedHerder-connect-1-1]
      2023-04-29 09:11:01,914 INFO WorkerSourceTask{id=mongodb-access-apps-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask) [SourceTaskOffsetCommitter-1]
      2023-04-29 09:11:01,914 WARN Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart (io.debezium.connector.common.BaseSourceTask) [SourceTaskOffsetCommitter-1]
      2023-04-29 09:11:02,149 INFO [Worker clientId=connect-1, groupId=connect-cluster] Member connect-1-9cd62683-6367-4ea6-837b-212c452d1483 sending LeaveGroup request to coordinator core-kafka-0.core-kafka-brokers.jkaspar.svc:9092 (id: 2147483647 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [kafka-coordinator-heartbeat-thread | connect-cluster]
      2023-04-29 09:12:01,915 INFO WorkerSourceTask{id=mongodb-access-apps-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask) [SourceTaskOffsetCommitter-1]
      2023-04-29 09:12:01,915 WARN Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart (io.debezium.connector.common.BaseSourceTask) [SourceTaskOffsetCommitter-1]
      2023-04-29 09:13:01,916 INFO WorkerSourceTask{id=mongodb-access-apps-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask) [SourceTaskOffsetCommitter-1]
      2023-04-29 09:13:01,916 WARN Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart (io.debezium.connector.common.BaseSourceTask) [SourceTaskOffsetCommitter-1]
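
In the excerpt, the stuck state shows up as the same WARN line repeating about once a minute. One way to spot it automatically is to scan the Kafka Connect pod log for that message. The sketch below is only a heuristic under assumptions: the function names are hypothetical, the timestamp format is taken from the excerpt, and the threshold of three repeats is arbitrary. A single occurrence of the warning during a normal restart may be harmless.

```python
import re
from datetime import datetime

# Message prefix taken from the WARN lines in the excerpt.
STUCK_WARNING = "Couldn't commit processed log positions with the source database"

# Timestamp layout as it appears in the excerpt: "2023-04-29 09:11:01,914 WARN ...".
TIMESTAMP = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s")


def stuck_warning_times(lines):
    """Return the timestamp (to the second) of every line carrying the warning."""
    times = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match and STUCK_WARNING in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def warning_gaps_seconds(lines):
    """Return the gaps, in seconds, between consecutive warnings."""
    times = stuck_warning_times(lines)
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


def looks_stuck(lines, min_repeats=3):
    """Heuristic: flag the log when the warning repeats at least min_repeats times."""
    return len(stuck_warning_times(lines)) >= min_repeats


if __name__ == "__main__":
    # Shortened copies of the three WARN lines from the excerpt.
    sample = [
        "2023-04-29 09:11:01,914 WARN Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart",
        "2023-04-29 09:12:01,915 WARN Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart",
        "2023-04-29 09:13:01,916 WARN Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart",
    ]
    print(looks_stuck(sample), warning_gaps_seconds(sample))
```

The same functions can be fed the lines of a saved pod log, for example `open("connect.log")` for a file captured with `kubectl logs`.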

          People

            Assignee: Unassigned
            Reporter: Jaroslav Kaspar (jaroslav.kaspar@jamf.com)