Debezium / DBZ-6037

Debezium is logging the full message along with the error

      Debezium logs the full message/record along with the error message, which is a problem for data that contains PII.

      Bug report

      For bug reports, provide this information, please:

      What Debezium connector do you use and what version?

      2.0.1 Postgresql

      What is the connector configuration?

      tasks.max: 1
      database.hostname: 127.0.0.1
      plugin.name: pgoutput
      slot.name: db_pslimit
      topic.prefix: karan_test6
      database.port: 3306
      database.user: replication_user
      database.dbname: test_debezium_database
      table.include.list: public.pslimit1, public.pslimit2

      signal.data.collection: public.signalling
      publication.name: db_pslimit
      publication.autocreate.mode: filtered

      transforms: unwrap
      transforms.unwrap.add.fields: op,table,source.ts_ms, source.txId, source.snapshot, source.lsn, source.xmin, source.db, source.version
      transforms.unwrap.delete.handling.mode: rewrite
      transforms.unwrap.drop.tombstones: false
      transforms.unwrap.type: io.debezium.transforms.ExtractNewRecordState

      value.converter.schemas.enable: true
      value.converter: io.confluent.connect.avro.AvroConverter
      value.converter.schema.registry.url: <>
      value.converter.schema.registry.ssl.keystore.location: /opt/kafka/external-configuration/schema-registry-keystore/schemaregistry-client-keystore.jks
      value.converter.schema.registry.ssl.keystore.password: "${file:/opt/kafka/external-configuration/schema-registry-storepass/connector.properties:storepass}"
      value.converter.schema.registry.ssl.key.password: "${file:/opt/kafka/external-configuration/schema-registry-storepass/connector.properties:keypass}"
      value.converter.schema.registry.ssl.keystore.type: PKCS12

      key.converter.schemas.enable: true
      key.converter: io.confluent.connect.avro.AvroConverter
      key.converter.schema.registry.url: <redacted>
      key.converter.schema.registry.ssl.keystore.location: /opt/kafka/external-configuration/schema-registry-keystore/schemaregistry-client-keystore.jks
      key.converter.schema.registry.ssl.keystore.password: "${file:/opt/kafka/external-configuration/schema-registry-storepass/connector.properties:storepass}"
      key.converter.schema.registry.ssl.key.password: "${file:/opt/kafka/external-configuration/schema-registry-storepass/connector.properties:keypass}"
      key.converter.schema.registry.ssl.keystore.type: PKCS12

      What is the captured database version and mode of deployment?

      cloudsql postgres GCP 1.14

      What behaviour do you expect?

      The ERROR should be logged, but it should not include the full message/record, because the record may contain PII data (as it does in our case).

      What behaviour do you see?

      2023-01-24 14:46:58,596 ERROR [debezium-connector-karan-pslimit6|task-0] Failed to properly convert data value for 'public.pslimit1.fee' of type numeric for row [142, null, NAN, true, <this could be sensitive data>]: (io.debezium.relational.TableSchemaBuilder) [debezium-postgresconnector-karan_test6-change-event-source-coordinator]

      As shown above, the full message is written to the logs, which is not ideal when the data contains PII.

       

      Do you see the same behaviour using the latest released Debezium version?

      Untested

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      <Your answer>

      How to reproduce the issue using our tutorial deployment?

      1. Create a table in Postgres that has a `numeric` column along with at least one other column.
      2. Insert a record with the value `NaN` in that `numeric` column (Postgres accepts it). This triggers the ERROR message.
      3. Check the Kafka Connect logs as it streams that data. The error message logs the full record with all of its values.

       

      Feature request or enhancement

      By default, the error logs should not show the full message (data). When schemas are enabled, the log could show the id (key), but logging the full message/data is risky when handling PII, because one may end up logging PII data.
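
      To make the request concrete, here is a minimal sketch of the idea in Python. It is not Debezium code (Debezium is written in Java) and does not reflect how `TableSchemaBuilder` builds its message. The function names, the column names other than `fee`, and the `<redacted>` placeholder are all hypothetical. It only shows an error message that keeps key column values and masks everything else.

```python
from typing import Any, Collection, Sequence


def describe_failed_row(
    column_names: Sequence[str],
    row: Sequence[Any],
    key_columns: Collection[str],
) -> str:
    """Return a log-safe description of a row.

    Values of key columns are shown so the record can still be located;
    every other value is replaced by a placeholder.
    """
    parts = []
    for name, value in zip(column_names, row):
        if name in key_columns:
            parts.append(f"{name}={value!r}")
        else:
            parts.append(f"{name}=<redacted>")
    return "[" + ", ".join(parts) + "]"


def conversion_error_message(
    column: str,
    type_name: str,
    column_names: Sequence[str],
    row: Sequence[Any],
    key_columns: Collection[str],
) -> str:
    """Build an error message shaped like the one in this report, minus row data."""
    safe_row = describe_failed_row(column_names, row, key_columns)
    return (
        f"Failed to properly convert data value for '{column}' "
        f"of type {type_name} for row {safe_row}"
    )


if __name__ == "__main__":
    # Hypothetical column names and values; only 'fee' appears in the report.
    columns = ["id", "note", "fee", "active", "email"]
    values = [142, None, "NaN", True, "alice@example.com"]
    print(conversion_error_message(
        "public.pslimit1.fee", "numeric", columns, values, {"id"}))
```

      Masking by default and showing only the key keeps the failing record identifiable without copying its payload into the logs. The full row could still be made available behind an explicit opt-in, for example at DEBUG/TRACE level.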

      Which use case/requirement will be addressed by the proposed feature?

      <Your answer>

      Implementation ideas (optional)

      <Your answer>

              Unassigned Unassigned
              ee07dazn Karan Rewari (Inactive)