Debezium / DBZ-5070

Data duplication problem using postgresql source on debezium server



      Bug report

      Debezium Server

      Data duplication occurs when using the PostgreSQL source connector with Debezium Server.

      When Debezium Server is started for the second time, the data of the last transaction is emitted once more. From the third run onwards, the behaviour is normal.

       

      Case 1)

      1) Create a table in PostgreSQL.
      2) Load 5 rows of data with the COPY command.
      3) Start the CDC job with Debezium Server.
      4) Row 5 arrives with op code "r".
      5) Stop Debezium Server.
      6) Start Debezium Server again.
      7) Row 5 arrives with op code "c".   <= Duplicate data (snapshot data).
      8) Stop Debezium Server.
      9) Start Debezium Server again.
      10) No data arrives (normal).

       

      Case 2)

      1) Create a table in PostgreSQL.
      2) Load 5 rows of data with the COPY command.
      3) Load one additional row with a SQL INSERT.
      4) Start the CDC job with Debezium Server.
      5) Row 6 arrives with op code "r".
      6) Stop Debezium Server.
      7) Start Debezium Server again.
      8) Row 1 arrives with op code "c".   <= Duplicate data (insert data).
      9) Stop Debezium Server.
      10) Start Debezium Server again.
      11) No data arrives (normal).
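
One way to confirm the duplicates in both cases is to collect the change events emitted across runs and look for rows that appear more than once. The following is a minimal sketch, not part of the original report: the helper names `envelope` and `find_repeated_rows` are hypothetical, the key column `id` is an assumption (the report does not list the table's columns), and the events are assumed to be Debezium JSON envelopes, with or without the `payload` wrapper.

```python
def envelope(event):
    """Return the change-event envelope, unwrapping `payload` if schemas are enabled."""
    return event.get("payload", event)


def find_repeated_rows(events, key_fields):
    """Map each row key seen more than once to the list of op codes it arrived with.

    `key_fields` names the columns that identify a row (an assumption made
    by the caller, e.g. the table's primary key).
    """
    ops_by_key = {}
    for event in events:
        env = envelope(event)
        after = env.get("after")
        if after is None:
            # Events without an `after` image (e.g. deletes) are skipped.
            continue
        key = tuple(after[field] for field in key_fields)
        ops_by_key.setdefault(key, []).append(env["op"])
    return {key: ops for key, ops in ops_by_key.items() if len(ops) > 1}


# Hypothetical events mirroring case 1: a snapshot read in the first run,
# then the same row delivered again as a create in the second run.
first_run = [{"op": "r", "after": {"id": 5}}]
second_run = [{"payload": {"op": "c", "after": {"id": 5}}}]

repeated = find_repeated_rows(first_run + second_run, ["id"])
print(repeated)
```

A row reported with the op codes `r` followed by `c` matches the pattern described in the steps above.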

      What Debezium connector do you use and what version?

      Debezium Server 1.8.1.Final and 1.9.1.Final

      What is the connector configuration?

      debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
      debezium.source.offset.storage.file.filename=data/job1-offset.dat
      debezium.source.offset.flush.interval.ms=0
      debezium.source.database.hostname=localhost
      debezium.source.database.port=15432
      debezium.source.database.user=user
      debezium.source.database.password=password
      debezium.source.database.dbname=testdb
      debezium.source.database.server.name=postgre-job1
      debezium.source.schema.include.list=public
      debezium.source.table.include.list=public.housing_ny_sample
      debezium.source.max.queue.size=8192
      debezium.source.max.batch.size=2048
      debezium.source.snapshot.mode=initial
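
When scripting the reproduction, it can help to load these properties and check which offset file and snapshot mode a given run used. The following is a minimal sketch under stated assumptions: `load_properties` is a hypothetical helper that handles only simple `key=value` lines, not the full Java properties syntax, and the sample text repeats three lines from the configuration above.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


SAMPLE = """\
debezium.source.offset.storage.file.filename=data/job1-offset.dat
debezium.source.offset.flush.interval.ms=0
debezium.source.snapshot.mode=initial
"""

PREFIX = "debezium.source."
props = load_properties(SAMPLE)

# Keep only the source connector settings, with the prefix stripped.
source = {k[len(PREFIX):]: v for k, v in props.items() if k.startswith(PREFIX)}

print(source["offset.storage.file.filename"])
print(source["offset.flush.interval.ms"])
print(source["snapshot.mode"])
```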

      What is the captured database version and mode of deployment?

      debezium-postgresql:1.9

      What behaviour do you expect?

      No duplicate events should be emitted.

       

      What behaviour do you see?

      When Debezium Server is started for the second time, the data of the last transaction is emitted once more. From the third run onwards, the behaviour is normal.

      Do you see the same behaviour using the latest released Debezium version?

      Yes, 1.8.1 and 1.9.1 both show the same behaviour.

       

       

      This does not happen with MySQL; it only happens with PostgreSQL.
      Are there any additional settings that should be applied?

       

            vjuranek@redhat.com Vojtech Juranek
            idpdh82 suho park (Inactive)