Debezium / DBZ-5930

Vitess: Support snapshot feature

      Feature request or enhancement

      For feature requests or enhancements, provide this information, please:

      Which use case/requirement will be addressed by the proposed feature?

      • Create a materialized table using an ETL-type solution like Apache Flink

      Implementation ideas (optional)

      VStream API has a VStream Copy feature, which allows the client to perform an initial consistent snapshot.

      To use VStream Copy, the connector only needs to pass an empty string as the GTID. Once VTGate finishes copying the tables for the given shards, the VStream API automatically starts streaming changes from the position at which the copy was taken.
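
      The choice between a snapshot and plain streaming can be sketched as follows. This is a minimal illustration, not the connector's code: `ShardGtid` and `initial_vgtid` are hypothetical names, and the use of "current" as the GTID for streaming from the latest position is an assumption that is not stated in this issue.

```python
from dataclasses import dataclass
from typing import List

# Empty GTID: request VStream Copy, as described above.
COPY_GTID = ""
# Assumption: "current" requests streaming from the latest position.
CURRENT_GTID = "current"


@dataclass(frozen=True)
class ShardGtid:
    """Hypothetical per-shard start position sent with the VStream request."""
    keyspace: str
    shard: str
    gtid: str


def initial_vgtid(keyspace: str, shards: List[str], snapshot: bool) -> List[ShardGtid]:
    """Build the starting position for every shard.

    With snapshot=True each shard gets an empty GTID, which asks for a copy
    of the tables before change streaming begins.
    """
    gtid = COPY_GTID if snapshot else CURRENT_GTID
    return [ShardGtid(keyspace, shard, gtid) for shard in shards]
```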

      The connector should consider the following two points:

      1. The event state machine in gRPC responses differs from that of the existing replication phase. Specifically, the connector receives duplicate BEGIN and VGTID events during the copy phase.
      2. The specification for resuming a failed copy process also differs from the existing one for the replication phase. Specifically, the connector must handle a new parameter called "table_p_ks" in the VGTID and a new event called "COPY_COMPLETED".
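
      One way to tolerate the duplicate events from the first point is sketched below. It is an illustration under assumptions: the dictionary event shape, the `CopyAwareGrouper` name, and the ROW and COMMIT event names are hypothetical here, and the exact ordering of events during a copy is not specified in this issue.

```python
from typing import Any, Dict, List, Optional, Tuple

Event = Dict[str, Any]  # hypothetical shape, e.g. {"type": "ROW", "row": ...}


class CopyAwareGrouper:
    """Groups events into transactions while tolerating repeated BEGIN/VGTID."""

    def __init__(self) -> None:
        self.in_txn = False
        self.vgtid: Optional[Any] = None
        self.rows: List[Event] = []
        self.copy_completed = False

    def feed(self, event: Event) -> Optional[Tuple[Any, List[Event]]]:
        """Consume one event; return (vgtid, rows) when a transaction closes."""
        kind = event["type"]
        if kind == "BEGIN":
            # A second BEGIN inside an open transaction is not an error.
            self.in_txn = True
        elif kind == "VGTID":
            # Keep only the most recent VGTID seen in the transaction.
            self.vgtid = event["vgtid"]
        elif kind == "ROW":
            self.rows.append(event)
        elif kind == "COMMIT" and self.in_txn:
            txn = (self.vgtid, self.rows)
            self.in_txn, self.rows = False, []
            return txn
        elif kind == "COPY_COMPLETED":
            # Marks the end of the copy phase.
            self.copy_completed = True
        return None
```

      Feeding BEGIN, VGTID, BEGIN, ROW, VGTID, COMMIT into this sketch yields a single transaction carrying the last VGTID.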

      Since handling the first point is trivial, enabling the connector to call VStream Copy is straightforward.

      The second point, on the other hand, requires the upcoming Vitess 16, because I've realized that the current Vitess release lacks some critical features that the connector needs to resume the copy correctly. The discussion is here: https://vitess.slack.com/archives/C0PQY0PTK/p1661155731380559

      Consequently, I propose splitting the implementation of this feature into two PRs:

      1. Support the snapshot feature without a snapshot retry. That is, after a failure the connector restarts streaming the changes from the latest position; if the failure occurs while the snapshot is being taken, users must restart the snapshot manually from the beginning. This first PR can be created now.
      2. Support the snapshot feature with a proper retry. That is, the connector resumes copying the tables after a failure. This second PR depends on Vitess 16 or later.
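
      The difference between the two PRs can be summarised as a restart decision. The sketch below is only an illustration of the proposal; `RestartAction`, `plan_restart`, and its parameters are hypothetical names and do not correspond to existing connector code.

```python
from enum import Enum


class RestartAction(Enum):
    STREAM_FROM_POSITION = "stream_from_position"  # normal replication restart
    MANUAL_RESNAPSHOT = "manual_resnapshot"        # PR 1: user restarts the snapshot
    RESUME_COPY = "resume_copy"                    # PR 2: continue the copy


def plan_restart(snapshot_finished: bool,
                 has_table_p_ks: bool,
                 copy_resume_supported: bool) -> RestartAction:
    """Decide what to do after a failure.

    copy_resume_supported stands for "PR 2 is in place and Vitess is 16 or
    later"; has_table_p_ks stands for "the stored VGTID carries table_p_ks".
    """
    if snapshot_finished:
        return RestartAction.STREAM_FROM_POSITION
    if copy_resume_supported and has_table_p_ks:
        return RestartAction.RESUME_COPY
    return RestartAction.MANUAL_RESNAPSHOT
```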

      I think a snapshot feature without an automatic retry is still acceptable, because the snapshot is performed only at the beginning, so users can easily monitor its progress through either the Vitess or the Debezium logs.

              Assignee: Unassigned
              Reporter: Yohei Yoshimuta (yoheimuta, inactive)