Debezium / DBZ-5572

Vitess: Filter table.include.list during VStream subscription


    • Type: Enhancement
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 2.0.0.Beta2
    • Affects Version/s: 1.9.5.Final
    • Component/s: vitess-connector
    • Labels: None

      In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.

      Bug report

      For bug reports, provide this information, please:

      What Debezium connector do you use and what version?

      vitess-connector

      What is the connector configuration?

      "table.include.list": "byuser.channels_members"

      What is the captured database version and mode of deployment?

      AWS

      What behaviour do you expect?

      When table.include.list is used to subscribe to changes for only a subset of the tables in the database, the filtering is currently done in the Debezium VM. This is less efficient than filtering at the VtTablet level. During VStream subscription you can specify which tables you are interested in, and the VtTablet will then send only the changes to those tables. Filtering at the VtTablet level sends far fewer bytes over the network, and it also avoids errors related to tables you are not interested in. For example, we have seen schema-mismatch errors on gh-ost (online schema migration) metadata tables, which we do not care about. By subscribing only to the data tables we are interested in, we can avoid those problems as well.

      What behaviour do you see?

      Filtering is not done at VStream subscription time, and we sometimes get errors related to other tables in the database.

      Do you see the same behaviour using the latest released Debezium version?

      Yes

      Do you have the connector logs, ideally from start till finish?

      Errors like the following (the _ghc table is the temporary table created during the gh-ost migration process):

      VStream streaming onError. Status: Status{code=UNKNOWN, description=target: byuser.-4000.replica: vttablet: rpc error: code = Unknown desc = stream (at source tablet) error @ 08fb1cf3-0ce5-11ed-b921-0a8939501751:1-1443715: unknown table _mentions_unread_ghc in schema, cause=null}

      How to reproduce the issue using our tutorial deployment?

      <Your answer>

      Feature request or enhancement

      For feature requests or enhancements, provide this information, please:

      Which use case/requirement will be addressed by the proposed feature?

      <Your answer>

      Implementation ideas (optional)

      Specify the list of tables during VStream subscription, so that the VtTablet filters at the source.
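To illustrate the idea, here is a minimal sketch of how a table.include.list value might be turned into per-table rules for the VStream subscription. It is not the connector's implementation. The function name `build_vstream_rules` is hypothetical, and the `match`/`filter` dictionary shape is only an assumption modelled on the rule structure of Vitess VStream filters. Debezium treats include-list entries as regular expressions; this sketch handles literal table names only.

```python
def build_vstream_rules(table_include_list, keyspace):
    """Turn a comma-separated include list into VStream-style filter rules.

    Hypothetical helper: the rule shape ("match" plus "filter") is an
    assumption, and only literal table names are supported here.
    """
    rules = []
    for entry in table_include_list.split(","):
        name = entry.strip()
        if not name:
            continue  # ignore empty entries such as a trailing comma
        if "." in name:
            entry_keyspace, name = name.split(".", 1)
            if entry_keyspace != keyspace:
                continue  # entry belongs to a different keyspace
        rules.append({"match": name, "filter": "select * from " + name})
    return rules


# Example using the configuration from this report.
rules = build_vstream_rules("byuser.channels_members", "byuser")
subscribed = {rule["match"] for rule in rules}

# The gh-ost temporary table is not part of the subscription, so the
# tablet would have no reason to stream (or fail on) its events.
ghost_table_subscribed = "_mentions_unread_ghc" in subscribed
```

With rules like these attached to the subscription, only changes to channels_members would be sent, and tables such as _mentions_unread_ghc would never reach the connector.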

            Assignee: Unassigned
            Reporter: Henry Haiying Cai (haiyingcai, Inactive)