Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-6236

PostgreSQL connector doesn't restart properly if database if not reachable

    XMLWordPrintable

Details

    Description

      In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.

      Bug report

      For bug reports, provide this information, please:

      What Debezium connector do you use and what version?

      debezium-connector-postgresql 2.1.3

      What is the connector configuration?

       

      "connector.class": "io.debezium.connector.postgresql.PostgresConnector""transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter""slot.name": "outbox_business""tasks.max": "1""schema.history.internal.consumer.sasl.jaas.config": "${file:/opt/kafka-sasl/sasl.jaas.secret:config}""transforms": "outbox,flattenheaders""schema.include.list": "business""producer.override.security.protocol": "SASL_SSL""tombstones.on.delete": "false""topic.prefix": "business""transforms.outbox.table.expand.json.payload": "true""schema.history.internal.producer.security.protocol": "SASL_SSL""transforms.outbox.route.topic.replacement": "${routedByValue}""transforms.flattenheaders.type": "com.server.kafka.connect.transforms.FlattenHeader""schema.history.internal.producer.sasl.mechanism": "SCRAM-SHA-512""publication.autocreate.mode": "disabled""database.dbname": "business""database.user": "${file:/opt/business-dbz/business.secret:user}""producer.override.bootstrap.servers": "kafka-server""time.precision.mode": "adaptive""heartbeat.interval.ms": "3600000""schema.history.internal.kafka.bootstrap.servers": "kafka-server""database.port": "5432""plugin.name": "pgoutput""database.hostname": "business.server""database.password": "******************************************************************""name": "postgresql-outbox-business-prod""producer.override.sasl.jaas.config": "${file:/opt/kafka-sasl/sasl.jaas.secret:config}""schema.history.internal.consumer.sasl.mechanism": "SCRAM-SHA-512""schema.history.internal.producer.sasl.jaas.config": "${file:/opt/kafka-sasl/sasl.jaas.secret:config}""transforms.flattenheaders.key": "extraheaders""heartbeat-action-query": "INSERT business.debezium_heartbeat(dummy) VALUES (0);""producer.override.sasl.mechanism": "SCRAM-SHA-512""table.include.list": "business.outbox_event""snapshot.mode": "never""schema.history.internal.consumer.security.protocol": "SASL_SSL"}

       

      What is the captured database version and mode of depoyment?

      postgresql 12.10.0 in kubernetes

      What behaviour do you expect?

      When the database is restarted or unreachable for some time, the connector reports the connection lost.

      I expect the connector to retry to reach the database when it is available. As per https://debezium.io/documentation/faq/#what_happens_when_a_monitored_database_stops_or_crashes

       

      When a database server monitored by Debezium stops or crashes, the Debezium connector will likely try to re-establish communication 

      What behaviour do you see?

       

      Caused by: org.postgresql.util.PSQLException: Connection to business.server.business.svc.cluster.local:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.Caused by: java.net.ConnectException: Connection refused (Connection refused) 

      and connector doesn't retry anymore

      Do you see the same behaviour using the latest relesead Debezium version?

      Tested the latest released version and yes, same issue

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      tutorial-postgres-1   | 2023-03-20 08:54:41.862 GMT [1] LOG:  received fast shutdown request
      tutorial-postgres-1   | 2023-03-20 08:54:41.865 GMT [1] LOG:  aborting any active transactions
      tutorial-postgres-1   | 2023-03-20 08:54:41.865 GMT [35] FATAL:  terminating connection due to administrator command
      tutorial-postgres-1   | 2023-03-20 08:54:41.867 GMT [1] LOG:  background worker "logical replication launcher" (PID 32) exited with exit code 1
      tutorial-postgres-1   | 2023-03-20 08:54:41.868 GMT [27] LOG:  shutting down
      tutorial-postgres-1   | 2023-03-20 08:54:42.041 GMT [27] LOG:  checkpoint starting: shutdown immediate
      tutorial-postgres-1   | 2023-03-20 08:54:42.064 GMT [27] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.009 s, sync=0.003 s, total=0.045 s; sync files=2, longest=0.002 s, average=0.002 s; distance=0 kB, estimate=0 kB
      tutorial-postgres-1   | 2023-03-20 08:54:42.069 GMT [1] LOG:  database system is shut down
      tutorial-postgres-1 exited with code 0
      tutorial-postgres-1 exited with code 0
      tutorial-connect-1    | 2023-03-20 08:54:42,521 ERROR  Postgres|dbserver1|streaming  Producer failure   [io.debezium.pipeline.ErrorHandler]
      tutorial-connect-1    | org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
      tutorial-connect-1    |         at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
      tutorial-connect-1    |         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
      tutorial-connect-1    |         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:190)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:134)
      tutorial-connect-1    |         at io.debezium.jdbc.JdbcConnection.prepareQuery(JdbcConnection.java:693)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.probeConnectionIfNeeded(PostgresStreamingChangeEventSource.java:349)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.searchWalPosition(PostgresStreamingChangeEventSource.java:342)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:161)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:41)
      tutorial-connect-1    |         at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:174)
      tutorial-connect-1    |         at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:141)
      tutorial-connect-1    |         at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
      tutorial-connect-1    |         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      tutorial-connect-1    |         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      tutorial-connect-1    |         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      tutorial-connect-1    |         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      tutorial-connect-1    |         at java.base/java.lang.Thread.run(Thread.java:829)
      tutorial-connect-1    | 2023-03-20 08:54:42,771 WARN   ||  Going to restart connector after 10 sec. after a retriable exception   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:54:51,017 WARN   ||  Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:54:52,531 INFO   Postgres|dbserver1|streaming  Finished streaming   [io.debezium.pipeline.ChangeEventSourceCoordinator]
      tutorial-connect-1    | 2023-03-20 08:54:52,531 INFO   Postgres|dbserver1|streaming  Connected metrics set to 'false'   [io.debezium.pipeline.ChangeEventSourceCoordinator]
      tutorial-connect-1    | 2023-03-20 08:54:52,532 INFO   ||  Connection gracefully closed   [io.debezium.jdbc.JdbcConnection]
      tutorial-connect-1    | 2023-03-20 08:54:52,534 INFO   ||  Connection gracefully closed   [io.debezium.jdbc.JdbcConnection]
      tutorial-connect-1    | 2023-03-20 08:54:52,534 WARN   ||  WorkerSourceTask{id=inventory-connector-0} failed to poll records from SourceTask. Will retry operation.   [org.apache.kafka.connect.runtime.AbstractWorkerSourceTask]
      tutorial-connect-1    | org.apache.kafka.connect.errors.RetriableException: An exception occurred in the change event producer. This connector will be restarted.
      tutorial-connect-1    |         at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:49)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:180)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:41)
      tutorial-connect-1    |         at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:174)
      tutorial-connect-1    |         at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:141)
      tutorial-connect-1    |         at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
      tutorial-connect-1    |         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      tutorial-connect-1    |         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      tutorial-connect-1    |         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      tutorial-connect-1    |         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      tutorial-connect-1    |         at java.base/java.lang.Thread.run(Thread.java:829)
      tutorial-connect-1    | Caused by: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
      tutorial-connect-1    |         at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
      tutorial-connect-1    |         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
      tutorial-connect-1    |         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:190)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:134)
      tutorial-connect-1    |         at io.debezium.jdbc.JdbcConnection.prepareQuery(JdbcConnection.java:693)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.probeConnectionIfNeeded(PostgresStreamingChangeEventSource.java:349)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.searchWalPosition(PostgresStreamingChangeEventSource.java:342)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:161)
      tutorial-connect-1    |         ... 9 more
      tutorial-connect-1    | 2023-03-20 08:54:52,534 INFO   ||  Awaiting end of restart backoff period after a retriable error   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:54:54,535 INFO   ||  Awaiting end of restart backoff period after a retriable error   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:54:56,535 INFO   ||  Awaiting end of restart backoff period after a retriable error   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:54:58,536 INFO   ||  Awaiting end of restart backoff period after a retriable error   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:00,536 INFO   ||  Awaiting end of restart backoff period after a retriable error   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||  Starting PostgresConnectorTask with configuration:   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     connector.class = io.debezium.connector.postgresql.PostgresConnector   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     database.user = postgres   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     database.dbname = postgres   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     topic.prefix = dbserver1   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     task.class = io.debezium.connector.postgresql.PostgresConnectorTask   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     tasks.max = 1   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     database.hostname = postgres   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     database.password = ********   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     name = inventory-connector   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,537 INFO   ||     schema.include.list = inventory   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,538 INFO   ||     database.port = 5432   [io.debezium.connector.common.BaseSourceTask]
      tutorial-connect-1    | 2023-03-20 08:55:02,538 INFO   ||  Loading the custom topic naming strategy plugin: io.debezium.schema.SchemaTopicNamingStrategy   [io.debezium.config.CommonConnectorConfig]
      tutorial-connect-1    | 2023-03-20 08:55:04,858 ERROR  ||  WorkerSourceTask{id=inventory-connector-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted   [org.apache.kafka.connect.runtime.WorkerTask]
      tutorial-connect-1    | io.debezium.DebeziumException: Couldn't obtain encoding for database postgres
      tutorial-connect-1    |         at io.debezium.connector.postgresql.connection.PostgresConnection.getDatabaseCharset(PostgresConnection.java:488)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:75)
      tutorial-connect-1    |         at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:136)
      tutorial-connect-1    |         at io.debezium.connector.common.BaseSourceTask.startIfNeededAndPossible(BaseSourceTask.java:235)
      tutorial-connect-1    |         at io.debezium.connector.common.BaseSourceTask.poll(BaseSourceTask.java:158)
      tutorial-connect-1    |         at org.apache.kafka.connect.runtime.AbstractWorkerSourceTask.poll(AbstractWorkerSourceTask.java:452)
      tutorial-connect-1    |         at org.apache.kafka.connect.runtime.AbstractWorkerSourceTask.execute(AbstractWorkerSourceTask.java:346)
      tutorial-connect-1    |         at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:189)
      tutorial-connect-1    |         at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:244)
      tutorial-connect-1    |         at org.apache.kafka.connect.runtime.AbstractWorkerSourceTask.run(AbstractWorkerSourceTask.java:72)
      tutorial-connect-1    |         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      tutorial-connect-1    |         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      tutorial-connect-1    |         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      tutorial-connect-1    |         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      tutorial-connect-1    |         at java.base/java.lang.Thread.run(Thread.java:829)
      tutorial-connect-1    | Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
      tutorial-connect-1    |         at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:331)
      tutorial-connect-1    |         at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
      tutorial-connect-1    |         at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:247)
      tutorial-connect-1    |         at org.postgresql.Driver.makeConnection(Driver.java:434)
      tutorial-connect-1    |         at org.postgresql.Driver.connect(Driver.java:291)
      tutorial-connect-1    |         at io.debezium.jdbc.JdbcConnection.lambda$patternBasedFactory$1(JdbcConnection.java:240)
      tutorial-connect-1    |         at io.debezium.jdbc.JdbcConnection$ConnectionFactoryDecorator.connect(JdbcConnection.java:126)
      tutorial-connect-1    |         at io.debezium.jdbc.JdbcConnection.connection(JdbcConnection.java:872)
      tutorial-connect-1    |         at io.debezium.jdbc.JdbcConnection.connection(JdbcConnection.java:867)
      tutorial-connect-1    |         at io.debezium.connector.postgresql.connection.PostgresConnection.getDatabaseCharset(PostgresConnection.java:485)
      tutorial-connect-1    |         ... 14 more
      tutorial-connect-1    | Caused by: java.net.UnknownHostException: postgres
      tutorial-connect-1    |         at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:229)
      tutorial-connect-1    |         at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
      tutorial-connect-1    |         at java.base/java.net.Socket.connect(Socket.java:609)
      tutorial-connect-1    |         at org.postgresql.core.PGStream.createSocket(PGStream.java:241)
      tutorial-connect-1    |         at org.postgresql.core.PGStream.<init>(PGStream.java:98)
      tutorial-connect-1    |         at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:109)
      tutorial-connect-1    |         at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:235)
      tutorial-connect-1    |         ... 23 more
      tutorial-connect-1    | 2023-03-20 08:55:04,859 INFO   ||  Stopping down connector   [io.debezium.connector.common.BaseSourceTask] 

      How to reproduce the issue using our tutorial deployment?

      cd tutorial
      
      docker-compose -f docker-compose-postgres.yaml up
      
      curl -i -X POST -H "Accept:application/json" -H  "Content-Type:appl
      ication/json" http://localhost:8083/connectors/ -d @register-postgres.json
      
      docker stop tutorial-postgres-1

      Attachments

        Issue Links

          Activity

            People

              ccranfor@redhat.com Chris Cranford
              mrluje Vincent D (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: