      Feature request or enhancement

      For feature requests or enhancements, provide this information, please:

      Which use case/requirement will be addressed by the proposed feature?

      • Create a materialized table using an ETL-type solution like Apache Flink

      Implementation ideas (optional)

      The VStream API has a VStream Copy feature, which allows the client to perform an initial consistent snapshot.

      To use VStream Copy, the connector just needs to pass an empty string as the GTID. Once VTGate finishes copying the tables for the given shards, the VStream API automatically starts streaming changes from the position where the copy was done.
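The start-position choice described above can be sketched as follows. This is a minimal illustration, not the connector's actual code: the class and method names are hypothetical, and the `"current"` value for streaming without a snapshot is an assumption about the VStream convention that this issue does not state.

```java
// Hypothetical helper: picks the GTID string sent in the initial VStream request.
final class VStreamStartPosition {
    // An empty GTID asks VStream to copy the tables first (VStream Copy).
    static final String EMPTY_GTID = "";
    // Assumption: "current" means "start streaming from the current position".
    static final String CURRENT_GTID = "current";

    private VStreamStartPosition() {
    }

    // Returns the GTID for a first start, depending on whether a snapshot is requested.
    static String initialGtid(boolean snapshotRequested) {
        return snapshotRequested ? EMPTY_GTID : CURRENT_GTID;
    }
}
```

With a snapshot requested, `VStreamStartPosition.initialGtid(true)` yields the empty string, so the copy phase runs before change streaming begins.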

      The connector should consider the following two points:

      1. The event state machine in gRPC responses differs from that of the existing replication phase. Specifically, the connector receives duplicate BEGIN and VGTID events during a copy phase.
      2. The specification for resuming a failed copy process also differs from the existing one for the replication phase. Specifically, the connector must handle a new parameter called "table_p_ks" in VGTID and a new event called "COPY_COMPLETED."
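One way to tolerate the duplicate events from the first point is sketched below. It assumes that "duplicate" means a second BEGIN or VGTID can arrive before the COMMIT of the same transaction; the exact sequence VTGate emits is not spelled out in this issue. All types here are hypothetical stand-ins, not the Vitess protobuf classes or the connector's own classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the events seen during a copy phase.
final class CopyPhaseEvents {
    enum Type { BEGIN, VGTID, ROW, COMMIT, COPY_COMPLETED }

    static final class Event {
        final Type type;
        final String payload; // VGTID string or row data; null for other events

        Event(Type type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    static final class Result {
        final List<String> committed = new ArrayList<>(); // entries of the form "vgtid:row"
        boolean copyCompleted;
    }

    private CopyPhaseEvents() {
    }

    // Consumes events, ignoring a repeated BEGIN and keeping only the latest VGTID.
    static Result consume(List<Event> events) {
        Result result = new Result();
        List<String> pendingRows = new ArrayList<>();
        boolean inTransaction = false;
        String lastVgtid = null;
        for (Event e : events) {
            switch (e.type) {
                case BEGIN:
                    // A second BEGIN inside an open transaction changes nothing.
                    inTransaction = true;
                    break;
                case VGTID:
                    // The most recent VGTID replaces any earlier one.
                    lastVgtid = e.payload;
                    break;
                case ROW:
                    if (inTransaction) {
                        pendingRows.add(e.payload);
                    }
                    break;
                case COMMIT:
                    for (String row : pendingRows) {
                        result.committed.add(lastVgtid + ":" + row);
                    }
                    pendingRows.clear();
                    inTransaction = false;
                    break;
                case COPY_COMPLETED:
                    result.copyCompleted = true;
                    break;
            }
        }
        return result;
    }
}
```

The sketch only records that COPY_COMPLETED was seen; handling "table_p_ks" for resuming a copy is left out because it depends on the Vitess behavior discussed in the second point.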

      Since handling the first point is trivial, allowing the connector to call VStream Copy is straightforward.

      The second point, on the other hand, requires the upcoming Vitess 16, because the current Vitess release lacks some critical features that the connector needs to resume the copy correctly. The discussion is here: https://vitess.slack.com/archives/C0PQY0PTK/p1661155731380559.

      Consequently, I propose to split the implementation of this feature into two PRs:

      1. Support the snapshot feature without a snapshot retry, meaning that when a failure occurs the connector restarts streaming changes from the latest position. If a failure occurs while taking the snapshot, users must restart it manually from the beginning. This first PR can be created now.
      2. Support the snapshot feature with an appropriate retry, meaning that the connector resumes copying the table when a failure occurs. This second PR depends on Vitess 16 or later.
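The difference between the two PRs can be summarized as a small decision function. This is only a sketch of the proposal above: the class, enum, and parameter names are hypothetical, and `resumableCopySupported` stands in for "running against a Vitess version that allows resuming a copy".

```java
// Hypothetical policy: what the connector does after a failure.
final class SnapshotRetryPolicy {
    enum Action { RESUME_STREAMING, RESUME_COPY, MANUAL_RESTART }

    private SnapshotRetryPolicy() {
    }

    static Action onFailure(boolean copyCompleted, boolean resumableCopySupported) {
        if (copyCompleted) {
            // Both PRs: after the copy is done, resume from the latest stored position.
            return Action.RESUME_STREAMING;
        }
        // First PR: a failed snapshot must be restarted manually from the beginning.
        // Second PR: the copy itself can be resumed.
        return resumableCopySupported ? Action.RESUME_COPY : Action.MANUAL_RESTART;
    }
}
```

Under the first PR, `resumableCopySupported` would always be false, so a failure during the copy phase maps to `MANUAL_RESTART`.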

      I think a snapshot feature without automatic retry is still acceptable, because the snapshot is performed only at the beginning, so users can easily monitor its progress through either the Vitess or the Debezium logs.

            [DBZ-5930] Vitess: Support snapshot feature

            Errata Tool added a comment - Since the problem described in this issue should be resolved in a recent advisory, it has been closed. For information on the advisory (Red Hat build of Debezium 2.1.4 release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:1814

            Debezium Builder added a comment - Released

            Jiri Pechanec added a comment - Hi, I am not sure it is necessary to implement it in the code via the snapshot interfaces. What I'd like to see is just the same configuration for the user; the implementation can be different for the time being if the framework model does not fit the Vitess way of doing things.

            Henry Haiying Cai (Inactive) added a comment - I am not very familiar with the Debezium snapshot feature, and I think it will probably take some time to implement all the methods and interfaces needed for SnapshotMode. jpechane is very familiar with those interfaces, so maybe he can chime in. If we decide not to implement the Debezium snapshot interface this time, we should probably rename this Jira/PR to avoid the word 'snapshot' (maybe something like: Vitess VStream Copy integration). Maybe somewhere down the line someone will have time to actually implement the Debezium snapshot interface.

            Yohei Yoshimuta (Inactive) added a comment - haiyingcai It's possible, but I don't have plans to build the feature on SnapshotMode and SnapshotsChangeEventSource. VStream Copy integrates smoothly with the existing VStream API, and the API does the heavy lifting, unlike, for example, MySQL. I'm afraid that hooking into SnapshotsChangeEventSource would only generate boilerplate at the moment.

            Henry Haiying Cai (Inactive) added a comment - yoheimuta, Debezium has SnapshotMode and SnapshotsChangeEventSource. Do you want to hook into that part for Debezium snapshots?

            Yohei Yoshimuta (Inactive) added a comment - I created the corresponding PR: https://github.com/debezium/debezium-connector-vitess/pull/112

              Unassigned Unassigned
              yoheimuta Yohei Yoshimuta (Inactive)