Type: Feature Request
Resolution: Done
Priority: Major
Fix Version/s: 2.1.0.Final
Affects Version/s: None
Component/s: vitess-connector
Labels:
- add-to-upgrade-guide

Blocked: False
Blocked Reason: None
Ready: False
Target Release: 2.1.4.GA

Feature request or enhancement

For feature requests or enhancements, provide this information, please:

Which use case/requirement will be addressed by the proposed feature?

Create a materialized table using an ETL-type solution such as Apache Flink.

Implementation ideas (optional)

VStream API has a VStream Copy feature, which allows the client to perform an initial consistent snapshot.

To use VStream Copy, the connector just needs to pass an empty string as the GTID. Once VTGate finishes copying the tables for the given shards, the VStream API automatically starts streaming changes from the position at which the copy finished.
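As a rough illustration of the empty-GTID idea, the sketch below builds the per-shard starting positions a client might send. `ShardPosition` and `InitialPositions` are hypothetical stand-ins, not Vitess or Debezium classes, and the "current" sentinel for "start from the latest position" is an assumption that this issue does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one shard entry of a VGTID; not the Vitess protobuf class.
final class ShardPosition {
    final String keyspace;
    final String shard;
    final String gtid;

    ShardPosition(String keyspace, String shard, String gtid) {
        this.keyspace = keyspace;
        this.shard = shard;
        this.gtid = gtid;
    }
}

final class InitialPositions {
    // An empty GTID asks for a VStream Copy (initial snapshot), as described above.
    static final String COPY_FROM_SCRATCH = "";
    // Assumed sentinel meaning "stream from the latest position" (not stated in this issue).
    static final String CURRENT = "current";

    // Builds one starting position per shard, depending on whether a snapshot is wanted.
    static List<ShardPosition> forShards(String keyspace, List<String> shards, boolean snapshot) {
        String gtid = snapshot ? COPY_FROM_SCRATCH : CURRENT;
        List<ShardPosition> positions = new ArrayList<>();
        for (String shard : shards) {
            positions.add(new ShardPosition(keyspace, shard, gtid));
        }
        return positions;
    }

    public static void main(String[] args) {
        // Keyspace and shard names here are made up for the example.
        for (ShardPosition p : forShards("commerce", List.of("-80", "80-"), true)) {
            System.out.println(p.keyspace + "/" + p.shard + " gtid='" + p.gtid + "'");
        }
    }
}
```

The only difference between the snapshot and non-snapshot cases in this sketch is the GTID string, which reflects the point above that the API does the rest.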

The connector should take the following two points into account:

1. The event state machine in the gRPC responses differs from that of the existing replication phase. Specifically, the connector receives duplicate BEGIN and VGTID events during the copy phase.
2. The procedure for resuming a failed copy also differs from the existing one for the replication phase. Specifically, the connector must handle a new field called "table_p_ks" in VGTID and a new event called "COPY_COMPLETED".
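One way to picture the first point is a small tracker that tolerates repeated BEGIN and VGTID events. This is only a sketch under assumptions: `SketchEventType` and `CopyPhaseTracker` are hypothetical names, the event set is a simplified subset, and the actual ordering of events that VTGate sends may differ from what the example feeds in.

```java
// Simplified, hypothetical subset of event types; not the Vitess enum.
enum SketchEventType { BEGIN, VGTID, ROW, COMMIT, COPY_COMPLETED }

final class CopyPhaseTracker {
    private boolean inTransaction = false;
    private boolean copyCompleted = false;
    private String pendingVgtid = null;
    private String committedVgtid = null;
    private int transactionCount = 0;

    // vgtid is only meaningful for VGTID events; pass null otherwise.
    void onEvent(SketchEventType type, String vgtid) {
        switch (type) {
            case BEGIN:
                // A duplicate BEGIN inside an open transaction is ignored.
                if (!inTransaction) {
                    inTransaction = true;
                    transactionCount++;
                }
                break;
            case VGTID:
                // A repeated VGTID simply replaces the pending position.
                pendingVgtid = vgtid;
                break;
            case COMMIT:
                // The position becomes durable only when the transaction closes.
                if (inTransaction) {
                    committedVgtid = pendingVgtid;
                    inTransaction = false;
                }
                break;
            case COPY_COMPLETED:
                // Marks the switch from the copy phase to the replication phase.
                copyCompleted = true;
                break;
            default:
                // ROW events would be turned into change records elsewhere.
                break;
        }
    }

    int transactionCount() { return transactionCount; }
    String committedVgtid() { return committedVgtid; }
    boolean isCopyCompleted() { return copyCompleted; }

    public static void main(String[] args) {
        CopyPhaseTracker tracker = new CopyPhaseTracker();
        tracker.onEvent(SketchEventType.BEGIN, null);
        tracker.onEvent(SketchEventType.VGTID, "pos-1");
        tracker.onEvent(SketchEventType.BEGIN, null);   // duplicate during copy
        tracker.onEvent(SketchEventType.VGTID, "pos-2"); // duplicate during copy
        tracker.onEvent(SketchEventType.COMMIT, null);
        System.out.println(tracker.transactionCount() + " " + tracker.committedVgtid());
    }
}
```

The sketch does not cover resuming a failed copy, which per the second point needs the "table_p_ks" field and the COPY_COMPLETED event.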

Since handling the first point is trivial, allowing the connector to call VStream Copy is straightforward.

The second point, on the other hand, requires the upcoming Vitess 16, because I've realized that the current Vitess release lacks some features that are critical for the connector to resume the copy through the API correctly. The discussion is here: https://vitess.slack.com/archives/C0PQY0PTK/p1661155731380559.

Consequently, I propose to split the implementation of this feature into two PRs:

1. Support the snapshot feature without a snapshot retry: when a failure occurs, the connector restarts streaming changes from the latest position. If the failure occurs while the snapshot is being taken, users must restart the snapshot manually from the beginning. This first PR can be created now.
2. Support the snapshot feature with a proper retry: when a failure occurs, the connector resumes copying the tables. This second PR depends on Vitess 16 or later.

I think a snapshot feature without an automatic retry is still acceptable, because the snapshot is performed only at the beginning, so users can easily monitor its progress through either the Vitess or the Debezium logs.

Errata Tool added a comment - 2023/04/17 3:23 PM

Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

For information on the advisory (Red Hat build of Debezium 2.1.4 release), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2023:1814

Debezium Builder added a comment - 2022/12/22 11:03 AM

Released

Jiri Pechanec added a comment - 2022/12/15 6:52 AM

Hi, I am not sure it is necessary to implement it in the code via the snapshot interfaces. What I'd like to see is just the same configuration for the user; the implementation can be different for the time being if the framework model does not fit the Vitess way of doing things.

Henry Haiying Cai (Inactive) added a comment - 2022/12/15 5:46 AM

I am not very familiar with the Debezium snapshot feature, but I think it will probably take some time to implement all the methods/interfaces needed for SnapshotMode. jpechane is very familiar with those interfaces; maybe he can chime in.

I think if we decide not to implement the Debezium snapshot interface this time, we should probably rename this Jira/PR so that it does not use the word 'snapshot' (maybe something like: Vitess VStream Copy integration). Maybe somewhere down the line someone will have time to actually implement the Debezium snapshot interface.

Yohei Yoshimuta (Inactive) added a comment - 2022/12/13 7:40 AM

haiyingcai It's possible, but I don't plan to build the feature on SnapshotMode and SnapshotsChangeEventSource.

VStream Copy integrates smoothly with the existing VStream API, and the API does the heavy lifting, unlike with MySQL, for example.
I'm afraid that hooking into SnapshotsChangeEventSource would only generate boilerplate at the moment.

Henry Haiying Cai (Inactive) added a comment - 2022/12/13 5:30 AM

yoheimuta, Debezium has SnapshotMode and SnapshotsChangeEventSource. Do you want to hook into those for Debezium snapshots?

Yohei Yoshimuta (Inactive) added a comment - 2022/12/13 2:41 AM

I created the corresponding PR: https://github.com/debezium/debezium-connector-vitess/pull/112

Assignee: Unassigned

Reporter: Yohei Yoshimuta (Inactive)

Votes: 0

Watchers: 5

Created: 2022/12/13 2:11 AM

Updated: 2023/04/17 3:23 PM

Resolved: 2022/12/21 12:12 PM

[DBZ-5930] Vitess: Support snapshot feature
