  1. Debezium
  2. DBZ-4488

Failed retriable operations are retried infinitely

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 2.2.0.CR1
    • 1.8.0.Final
    • core-library
    • None
    • False
    • False
    • Steps to reproduce:
      1. Start the pipeline from the SQL Server connector tutorial.
      2. Stop SQL Server:
        $ docker compose -f docker-compose-sqlserver.yaml stop sqlserver

         
        At this point, the connector retries starting indefinitely:

      1. An exception occurred in the change event producer. This connector will be restarted. Awaiting end of restart backoff period after a retriable error
      2. Awaiting end of restart backoff period after a retriable error
      3. Starting SqlServerConnectorTask with configuration
      4. GO TO 1

      During this loop, from the Kafka Connect perspective, the connector is still running:

      $ curl -s http://localhost:8083/connectors/inventory-connector/tasks/0/status | jq
      {
          "id": 0,
          "state": "RUNNING",
          "worker_id": "172.26.0.5:8083"
      }
      

      The Debezium core framework provides an API for declaring certain exceptions as retriable and implements the retry logic. However, there is no way to limit the number or the duration of retry attempts.
       
      During the retry loop, the connector is still reported as running by the Kafka Connect API, which makes an infinite retry loop hard to detect. If the loop lasts too long, the CDC data on the server may expire, which leads to data loss.
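
      To illustrate the missing limit, here is a minimal sketch of a retry loop with an upper bound on attempts. It is not Debezium's code or API: the names BoundedRetry, RetriableException, and maxRetries are hypothetical and chosen only for this example. A negative maxRetries reproduces the unbounded behavior described above; a non-negative value makes the operation fail once the limit is exceeded, so the failure becomes visible to the caller.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

public class BoundedRetry {

    /** Hypothetical marker for errors that are worth retrying. */
    public static class RetriableException extends RuntimeException {
        public RetriableException(String message) {
            super(message);
        }
    }

    /**
     * Runs the operation and retries it when it throws RetriableException.
     * maxRetries < 0 means "retry without limit" (the behavior reported here);
     * maxRetries >= 0 allows at most that many retries after the first attempt.
     */
    public static <T> T run(Callable<T> operation, int maxRetries, Duration backoff) throws Exception {
        int failures = 0;
        while (true) {
            try {
                return operation.call();
            } catch (RetriableException e) {
                failures++;
                if (maxRetries >= 0 && failures > maxRetries) {
                    // Surface the failure instead of looping forever.
                    throw new IllegalStateException("Giving up after " + maxRetries + " retries", e);
                }
                Thread.sleep(backoff.toMillis()); // wait out the backoff period
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice, then succeeds on the third attempt.
        String result = run(() -> {
            if (++calls[0] < 3) {
                throw new RetriableException("database unavailable");
            }
            return "connected";
        }, 5, Duration.ofMillis(10));
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

      With a limit in place, a permanently unavailable database ends in an exception after a known number of attempts, which a caller can turn into a failed task state instead of an apparently healthy one. Limiting by total elapsed time rather than attempt count would follow the same pattern.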

              Assignee: Unassigned
              Reporter: Sergei Morozov (Inactive)