  1. Debezium
  2. DBZ-1245

Postgres connector failing because empty state data is being stored in offsets topic


    • Steps to Reproduce:

      Seems to randomly happen shortly after new PG connectors using WAL2JSON are created.


      Sometimes a PG connector (using the WAL2JSON decoder) task can get into a weird state when it is restarted. I am seeing a message like this:

      {"name":"my-connector","connector":{"state":"RUNNING","worker_id":"localhost:8083"},"tasks":[{"id":0,"state":"FAILED","worker_id":"localhost:8083","trace":"java.lang.NullPointerException\n\tat io.debezium.connector.postgresql.SourceInfo.load(SourceInfo.java:132)\n\tat io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:109)\n\tat io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:49)\n\tat org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:198)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"}],"type":"source"}

      Restarting the connector task again does not help. It looks like the state data pulled off the offsets topic is empty, so there is no LSN to load. I looked through the offsets topic and here is what the message looks like:

      offset 27694: key ["my-connector",


      ]: {}

      I can get the connector working by manually writing an LSN of 0 to the partition and restarting the connector.

      I am not sure what is causing this empty data to be written, or whether it is related to the recent change I made to make the heartbeat fire for all events. The empty offset should not be written, but perhaps the SourceTask error handling should also be improved so that the connector falls back to getting the LSN from the replication slot when it cannot get the LSN from the offsets topic.
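
To make the suggested fallback concrete, here is a minimal sketch of the idea, not the connector's actual code. The class and method names are hypothetical, the offset key "lsn" is an assumption about how the LSN is stored in the offset map, and the slot lookup is represented by a plain supplier rather than a real database query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: names below are illustrative, not Debezium's API.
final class OffsetFallbackSketch {

    // Assumed key under which the LSN is stored in the offset map.
    static final String LSN_KEY = "lsn";

    // Returns the stored LSN, or empty when the offset map is null,
    // empty ({}), or holds a non-numeric value, instead of throwing an NPE.
    static OptionalLong lsnFromOffset(Map<String, ?> offset) {
        if (offset == null) {
            return OptionalLong.empty();
        }
        Object value = offset.get(LSN_KEY);
        if (value instanceof Number) {
            return OptionalLong.of(((Number) value).longValue());
        }
        return OptionalLong.empty();
    }

    // Prefers the LSN from the offsets topic; otherwise asks the slot.
    // slotLsn stands in for a query against the replication slot.
    static long resolveStartLsn(Map<String, ?> offset, LongSupplier slotLsn) {
        return lsnFromOffset(offset).orElseGet(slotLsn);
    }

    public static void main(String[] args) {
        Map<String, Object> emptyOffset = new HashMap<>(); // the {} case from this report
        Map<String, Object> goodOffset = new HashMap<>();
        goodOffset.put(LSN_KEY, 1000L);

        LongSupplier slot = () -> 42L; // placeholder for the slot's LSN

        System.out.println(resolveStartLsn(emptyOffset, slot)); // falls back to the slot
        System.out.println(resolveStartLsn(goodOffset, slot));  // uses the stored LSN
    }
}
```

With something along these lines, an empty record in the offsets topic would lead to a restart from the slot position rather than a failed task. Whether that is the right behavior in every case is a separate question.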


              • Assignee:
                jpechanec Jiri Pechanec
                trolison Taylor Rolison


                • Created: