Debezium / DBZ-3899

Debezium 2.0 input

Details

    • Task
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • 2.0.0.Final
    • None
    • None

    Description

      This issue serves as a loose umbrella for any potential changes in a future Debezium 2.0 version.

      • Change mapping of FLOAT to float32 (see DBZ-3865)
      • Align SMT options for adding additional fields/headers between outbox and flattening SMTs (see DBZ-3584, DBZ-4012)
      • The "tableChanges" field in the schema change event schema should probably be optional, as it isn't present for all kinds of schema change events (e.g. in the case of CREATE SCHEMA some_schema; see DBZ-3536)
      • Remove masking format version V1 (see DBZ-4033)
      • The heartbeat topic name should be modified so that it defaults to <logical.server.name.>.transaction, achieving consistency with all the other topic names; we might want to support a placeholder for the logical server name in all topics whose names contain it (see DBZ-4180)
      • Java 11 as a baseline for all connectors (DBZ-4949)
      • Use Java 17 for testing; that would, e.g., allow using multi-line strings for JSON docs (DBZ-4949)
      • Drop support for wal2json; also decide whether to keep support for decoderbufs, or whether we should go all in on pgoutput
      • Potentially drop MongoDB oplog (DBZ-4951)
      • Revise event format holistically
      • Drop legacy implementation of the MySQL connector (BinlogReader et al.) (DBZ-4950)
      • Remove "database.server.id.offset" and "snapshot.new.tables" properties from MySQL connector (they're part of the legacy parallel snapshotting) (DBZ-4950)
      • Use "database.server.name" also for MongoDB (currently it's "mongodb.name"); or use something more abstract like "logical.name" for all the connectors (for the Cassandra connector, see connectorName() and kafkaTopicPrefix())
      • Remove Column::hasDefaultValue (see DBZ-4239)
      • Remove PostgreSQL option toasted.value.placeholder, deprecated in 1.8 by DBZ-4276.
      • Should we switch to pgoutput as the default PostgreSQL plug-in?
      • Remove wal2json plug-in (DBZ-4156)
      • Improve Table metamodel with a flag that indicates the table has LOB columns; useful for Oracle.
      • Could variable-scale decimals be represented by a schema parameter instead of a struct?
      • We have quite a few similar validation methods, particularly around deprecated options; should these be unified?
      • Get rid of Field::withGroup() and related methods, which currently are used for grouping in the UI; replace this with properly organized config groups
      • Introduce a specific namespace for database pass-thru properties to avoid possible tainting where some database.{} properties aren't meant to be pass-thru, i.e. a new database.props. namespace.
      • Remove Confluent's Avro converters from Debezium Connect image
      • Make Oracle's skipped.operations default to "t" for consistent behavior with PG.
      • Remove deprecated internal.key.converter and internal.value.converter configuration options from Embedded Engine & its derivatives.
      • Debezium Connector Framework API
      • Make explicit rules for schema snapshotting related to the database history
        • How to handle system/built-in schemas: should it be recorded when built-in/system tables are ignored?
        • Should schema change events be emitted for all captured tables or for all tables for which the schema history is recorded?
      • Align queue.size properties' values across connectors
      • Rename `docker-images` repository and JIRA component to `container-images`
        • Clean up docs with new links
        • Clean up container image build jobs, etc.
      • Make interval.handling.mode default to "string"
      • Re-use the heartbeat topic for Debezium administrative messages; e.g. when a snapshot is completed, it would no longer be necessary to buffer messages in order to mark the last one, as an end-of-snapshot message would be sent with the updated offsets
      • "table.ignore.builtin" should become an internal property and be removed from the docs (see DBZ-4052 comments)
      • Consider defaulting schema.adjustment.mode to "none" (see DBZ-3535); I'm a bit on the fence, as the current default behavior (adjusting for Avro conventions) arguably offers a better getting-started experience
      • Use SchemaGenerator groups (Field.withGroup()) for Kafka Connect configuration groups (DBZ-4915)
      • Potentially de-incubate components, e.g. Debezium Server
      • Reverse the logic of handling retriable errors: retry by default
      • Explore whether we truly need the logical connector name property; could it at least default to the connector name itself, if not given? In any case, it should be moved out of the "database.*" namespace and its purpose should be made clearer, e.g. "namespace" or something like that; the concern about using the KC connector name as a default is that it exposes a technical infrastructure detail in the API of emitted events (as the namespace is part of topic and schema names)
      • The logical server name probably can also be omitted from the connector offset format, as it's never read back
      • Consider whether only CDC-enabled tables should be snapshotted by default (see DBZ-4202); another option: have a special constant to be used as include filter, which would indicate that the filters should be obtained from the DB (either CDC-enabled tables for SQL Server, or publication tables for PG)
      • Remove unused code from Oracle connector (see DBZ-4973)
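
      The pass-thru namespace item above could be illustrated roughly as follows. This is only a minimal sketch, not Debezium's actual implementation: it assumes the proposed "database.props." prefix from that item, and the class name PassThroughProps and the method extract() are hypothetical names introduced here for illustration. Only keys under the prefix are forwarded (with the prefix stripped); all other database.* keys stay connector-only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Debezium: selects pass-through properties by prefix.
final class PassThroughProps {

    // Assumption: the namespace proposed in the list above; not an existing Debezium constant.
    static final String PREFIX = "database.props.";

    // Returns only the entries whose key starts with PREFIX, with the prefix stripped.
    static Map<String, String> extract(Map<String, String> config) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            String key = entry.getKey();
            // Skip keys outside the namespace and the bare prefix itself.
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                result.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("database.hostname", "localhost");     // connector-only, not forwarded
        config.put("database.server.name", "inventory");  // connector-only, not forwarded
        config.put("database.props.useSSL", "false");     // forwarded as "useSSL"
        System.out.println(extract(config)); // prints {useSSL=false}
    }
}
```

      With an explicit prefix, a property such as database.server.name can no longer leak into the driver configuration by accident, which is the "tainting" concern raised in the item.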

            People

              Assignee: Unassigned
              Reporter: Gunnar Morling