Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-932

A long running PostgresTaskContext.refreshSchema causes the replication stream to fail

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Major Major
    • None
    • 0.8.3.Final
    • postgresql-connector
    • None

      If `PostgresTaskContext.refreshSchema` takes longer than `wal_sender_timeout` to run the replication stream errors on the first call to `read`.

      org.apache.kafka.connect.errors.ConnectException: An exception ocurred in the change event producer. This connector will be stopped.
      	at io.debezium.connector.base.ChangeEventQueue.throwProducerFailureIfPresent(ChangeEventQueue.java:168)
      	at io.debezium.connector.base.ChangeEventQueue.poll(ChangeEventQueue.java:149)
      	at io.debezium.connector.postgresql.PostgresConnectorTask.poll(PostgresConnectorTask.java:146)
      	at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:244)
      	at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:220)
      	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
      	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: org.postgresql.util.PSQLException: Database connection failed when reading from copy
      	at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:964)
      	at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
      	at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:145)
      	at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:114)
      	at org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:60)
      	at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.read(PostgresReplicationConnection.java:198)
      	at io.debezium.connector.postgresql.RecordsStreamProducer.streamChanges(RecordsStreamProducer.java:128)
      	at io.debezium.connector.postgresql.RecordsStreamProducer.lambda$start$1(RecordsStreamProducer.java:114)
      	... 5 more
      Caused by: java.io.EOFException
      	at org.postgresql.core.PGStream.receiveChar(PGStream.java:284)
      	at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1006)
      	at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:962)
      	... 12 more
      

      We run a tenant per schema which each schema having ~50 tables. Refreshing the schema takes over a minute.

      We have worked around this by using the fact that we set `schema.whitelist` to a single schema and passing that through to `connection.readSchema`.

      this.databaseCatalog = config.databaseName();
      String schemaWhitelist = config.schemaWhitelist();
      if (schemaWhitelist != null && !schemaWhitelist.isEmpty()) {
          // not valid if there is regex in this string or it has commas, postgresql puts this string into a LIKE
          this.schemaNamePattern = schemaWhitelist;
      } else {
          this.schemaNamePattern = null;
      }
      
      ...
      
      connection.readSchema(tables(), this.databaseCatalog, this.schemaNamePattern, filters.tableNameFilter(), null, true);
      

              Unassigned Unassigned
              sguymer Sam Guymer (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated: