Debezium / DBZ-4433

Support PostgreSQL multi-level partitions of tables


Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Critical
    • Affects Version: 1.7.0.Final
    • Component: postgresql-connector

    Description

      We use many tables on PostgreSQL (13.5), and Debezium (1.7.0) reads changes from the WAL and writes them to Kafka (kafka_2.13-2.8.0) topics.
      This works fine for tables with a single level of partitioning, for example:
      -- Root table
      CREATE TABLE stage.d_v_operator (
          df_operator_key int4 NOT NULL,
          df_druh_hry varchar NOT NULL,
          df_trigger_off_sdc2_group_history bool NULL,
          df_created timestamptz NOT NULL DEFAULT now(),
          df_valid bool NOT NULL DEFAULT true,
          df_v_operator_key int8 NOT NULL,
          idoperator varchar NULL,
          operatornazev varchar NULL,
          df_group_00_df_master_package_key int8 NULL,
          CONSTRAINT d_v_operator_pk PRIMARY KEY (df_operator_key, df_druh_hry, df_created, df_v_operator_key)
      )
      PARTITION BY RANGE (df_operator_key, df_druh_hry, df_created);

      -- 1st-level partition
      CREATE TABLE stage.d_v_operator_11223344_t_2112 PARTITION OF stage.d_v_operator FOR VALUES FROM (11223344, 'T', '2021-12-01 00:00:00+01') TO (11223344, 'T', '2021-12-31 23:59:59.999+01');
      Every table uses an index as its replica identity, for example:
      ALTER TABLE stage.d_v_operator REPLICA IDENTITY USING INDEX d_v_operator_pk;
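
      For reference, which replica identity the root table and its partitions actually ended up with can be checked in the system catalog. The query below is only a verification sketch based on the table names above; pg_class.relreplident is 'i' when an index is used, 'd' for the default (primary key) and 'f' for full.
      -- Verification sketch: list replica identity of the root table and its partitions
      SELECT c.oid::regclass AS table_name, c.relreplident
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE n.nspname = 'stage'
        AND c.relname LIKE 'd_v_operator%'
        AND c.relkind IN ('r', 'p');
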
      Now we have a table with multi-level partitions. The root table has three levels of partitions, as shown below:
      -- Root table
      CREATE TABLE stage.s_v_operator (
          df_package_key int8 NOT NULL,
          df_operator_key int4 NOT NULL,
          df_druh_hry varchar NOT NULL,
          obdobi tstzrange NULL,
          idoperator varchar NULL,
          operatornazev varchar NULL,
          df_v_operator_key int8 NOT NULL,
          rowhash text NOT NULL,
          CONSTRAINT s_v_operator_pkey PRIMARY KEY (df_package_key, df_v_operator_key, df_operator_key, df_druh_hry)
      )
      PARTITION BY LIST (df_druh_hry);

      -- 1st-level partitions
      CREATE TABLE stage.s_v_operator_default PARTITION OF stage.s_v_operator DEFAULT;
      CREATE TABLE stage.s_v_operator_t PARTITION OF stage.s_v_operator FOR VALUES IN ('T') PARTITION BY LIST (df_operator_key);

      -- 2nd-level partition
      CREATE TABLE stage.s_v_operator_t_11223344 PARTITION OF stage.s_v_operator_t FOR VALUES IN (11223344) PARTITION BY LIST (df_package_key);

      -- 3rd-level partition
      CREATE TABLE stage.s_v_operator_t_11223344_269811 PARTITION OF stage.s_v_operator_t_11223344 FOR VALUES IN ('269811');
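
      The hierarchy created by the statements above can be inspected with pg_partition_tree (available since PostgreSQL 12); this is only an illustrative check, not part of the original setup:
      -- Lists every partition of the root together with its parent, leaf flag and depth
      SELECT relid, parentrelid, isleaf, level
      FROM pg_partition_tree('stage.s_v_operator');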
       
      Debezium does not load changes into Kafka for these tables, and the log shows no errors. The schema and tables are in the connector's include list, and the tables have been added to the PostgreSQL publication.
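
      One way to see whether the connector is reading from the replication slot at all is to look at pg_replication_slots for the slot created below; this query is only a diagnostic sketch:
      -- Diagnostic sketch: an advancing confirmed_flush_lsn means the slot is being consumed
      SELECT slot_name, plugin, active, confirmed_flush_lsn
      FROM pg_replication_slots
      WHERE slot_name = 'my_slot';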
       
      The complete source database configuration process is shown below. It is the same for tables with one level of partitioning and for tables with multiple levels.
      CREATE PUBLICATION my_pub WITH (publish = 'insert, update, delete, truncate', publish_via_partition_root = true);
      SELECT pg_create_logical_replication_slot('my_slot', 'pgoutput');
      ALTER PUBLICATION my_pub ADD TABLE stage.d_v_operator; -- table with one level of partitioning
      ALTER PUBLICATION my_pub ADD TABLE stage.s_v_operator; -- table with three levels of partitioning
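
      Which tables the publication actually exposes can be verified via pg_publication_tables; with publish_via_partition_root = true, the root tables should be listed there. This is only a verification sketch, not part of the original procedure:
      -- Verification sketch: shows the tables published through my_pub
      SELECT * FROM pg_publication_tables WHERE pubname = 'my_pub';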
       
      We consulted the PostgreSQL replication developers about our problem.

      We reproduced the changes to the stage.s_v_operator table using test_decoding, which we read via pg_logical_slot_get_changes. See the PostgreSQL documentation: https://www.postgresql.org/docs/13/logicaldecoding-example.html
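
      For illustration, a minimal reproduction along the lines of that documentation might look like the following; the slot name test_slot and the inserted values are hypothetical:
      -- Throwaway test_decoding slot (hypothetical name)
      SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
      -- Hypothetical change that gets routed into the third-level partition
      INSERT INTO stage.s_v_operator (df_package_key, df_operator_key, df_druh_hry, df_v_operator_key, rowhash)
      VALUES (269811, 11223344, 'T', 1, 'dummy-hash');
      -- Consume the decoded changes, then clean up the slot
      SELECT * FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);
      SELECT pg_drop_replication_slot('test_slot');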

      All operations were displayed correctly, so unfortunately the problem appears to be on the Debezium side. For completeness, here is our connector configuration file:

      name=connect-debezium-postgres-source
      plugin.name=pgoutput
      connector.class=io.debezium.connector.postgresql.PostgresConnector
      publication.name=my_pub
      slot.name=my_slot
      snapshot.mode=never
      database.hostname=...
      database.port=5432
      database.user=myuser
      database.password=****
      database.dbname=mydb
      database.server.name=myservername
      schema.include.list=ldz_master,rdt,stage,df_master
      table.include.list=ldz_master.df_package,df_master.lst_provozovatel,rdt.kurs_cnb,stage.d_v_operator,stage.s_v_operator
      delete.topic.enable=true
      decimal.handling.mode=string
      toasted.value.placeholder={"message":"__debezium_unavailable_value"}
      topic.creation.default.replication.factor=-1
      topic.creation.default.partitions=-1
      topic.creation.default.cleanup.policy=delete
      topic.creation.default.compression.type=snappy

       
      Our database is almost 40 TB, so we do not use a snapshot.
       
      ---------------
       
      This problem was discussed in this Google Groups thread: https://groups.google.com/g/debezium/c/NroAVr2saxU

          People

            Assignee: Unassigned
            Reporter: Jan Kopic (jan.kopic.cz@gmail.com)
