Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-3429

SQL Server property (snapshot.select.statement.overrides) only matches 1st entry if comma-separated list also contains spaces

XMLWordPrintable

    • False
    • False
    • Undefined
    • Hide

      Using the Kinesis example as a guide, change the DB connector from MySql to SqlServer.  Define the following property configurations.  (Notice the space after the comma between each list entry for the properties)
      props.setProperty("table.include.list", "dbo.Patients, dbo.Visits, dbo.VisitActivities");
      props.setProperty("snapshot.select.statement.overrides", "dbo.Patients, dbo.Visits, dbo.VisitActivities");
      props.setProperty("snapshot.select.statement.overrides.dbo.Patients", "SELECT * FROM dbo.Patients WHERE DateCreated > '01/01/2021 00:00:00.000'");
      props.setProperty("snapshot.select.statement.overrides.dbo.Visits", "SELECT * FROM dbo.Visits WHERE DateCreated > '01/01/2021 00:00:00.000'");
      props.setProperty("snapshot.select.statement.overrides.dbo.VisitActivities", "SELECT * FROM dbo.VisitActivities WHERE ActivityDate > '01/01/2021 00:00:00.000'");
       
      Run the example application and review the application logs.  Verify that the properties are set for the SqlServerConnectorTask.  However, when the snapshot selects are performed, the override select statement is only being applied to the first table.
       
      Here is log statement for the first table (dbo.Patients):
      2021-04-13 16:49:22,628 INFO   ||    For table 'MMM360DB.dbo.Patients' using select statement: 'SELECT [Patients].[PatientKey],... FROM dbo.Patients WHERE DateCreated > '01/01/2021 00:00:00.000''   [io.debezium.relational.RelationalSnapshotChangeEventSource]
       
      Here is the abbreviated log statement for the other tables (dbo.Visits, dbo.VisitActivities):
      2021-04-13 16:49:22,739 INFO   ||    For table 'MMM360DB.dbo.Visits' using select statement: 'SELECT [Visits].[VisitKey],[Visits].[FacilityKey],... FROM [dbo].[Visits]'   [io.debezium.relational.RelationalSnapshotChangeEventSource]
       
      2021-04-13 16:49:23,389 INFO   ||    For table 'MMM360DB.dbo.VisitActivities' using select statement: 'SELECT [VisitActivities].[VisitActivityKey],... FROM [dbo].[VisitActivities]'   [io.debezium.relational.RelationalSnapshotChangeEventSource]
       
      Notice that the specific SELECT statement is only being used for the first entry.
       
      After reviewing the code for loading the snapshot.select.statement.overrides property, the problem is in the use of the String.split(",") to create an array of each entries.  Because the split is being performed only on the comma, the entries after the first entry will begin with a space...which will not match to the other select properties.
       

      Show
      Using the Kinesis example as a guide, change the DB connector from MySql to SqlServer.  Define the following property configurations.  (Notice the space after the comma between each list entry for the properties) props.setProperty("table.include.list", "dbo.Patients, dbo.Visits, dbo.VisitActivities"); props.setProperty("snapshot.select.statement.overrides", "dbo.Patients, dbo.Visits, dbo.VisitActivities"); props.setProperty("snapshot.select.statement.overrides.dbo.Patients", "SELECT * FROM dbo.Patients WHERE DateCreated > '01/01/2021 00:00:00.000'"); props.setProperty("snapshot.select.statement.overrides.dbo.Visits", "SELECT * FROM dbo.Visits WHERE DateCreated > '01/01/2021 00:00:00.000'"); props.setProperty("snapshot.select.statement.overrides.dbo.VisitActivities", "SELECT * FROM dbo.VisitActivities WHERE ActivityDate > '01/01/2021 00:00:00.000'");   Run the example application and review the application logs.  Verify that the properties are set for the SqlServerConnectorTask.  However, when the snapshot selects are performed, the override select statement is only being applied to the first table.   Here is log statement for the first table (dbo.Patients): 2021-04-13 16:49:22,628 INFO   ||    For table 'MMM360DB.dbo.Patients' using select statement: 'SELECT [Patients] . [PatientKey] ,... FROM dbo.Patients  WHERE DateCreated > '01/01/2021 00:00:00.000''     [io.debezium.relational.RelationalSnapshotChangeEventSource]   Here is the abbreviated log statement for the other tables (dbo.Visits, dbo.VisitActivities): 2021-04-13 16:49:22,739 INFO   ||    For table 'MMM360DB.dbo.Visits' using select statement: 'SELECT [Visits] . [VisitKey] , [Visits] . [FacilityKey] ,... FROM [dbo] . [Visits] '    [io.debezium.relational.RelationalSnapshotChangeEventSource]   2021-04-13 16:49:23,389 INFO   ||    For table 'MMM360DB.dbo.VisitActivities' using select statement: 'SELECT [VisitActivities] . [VisitActivityKey] ,... FROM [dbo] . [VisitActivities] '    [io.debezium.relational.RelationalSnapshotChangeEventSource]   Notice that the specific SELECT statement is only being used for the first entry.   After reviewing the code for loading the snapshot.select.statement.overrides property, the problem is in the use of the String.split(",") to create an array of each entries.  Because the split is being performed only on the comma, the entries after the first entry will begin with a space...which will not match to the other select properties.  

      Many of the properties for the DB connectors allows for a comma-separated list of values (i.e. table.include.list, snapshot.select.statement.overrides).  However, the processing of these values is done very differently for each property.  If spaces are also included in the comma-separated list for the table.include.list it works just fine.  However, if spaces are included in the list for property snapshot.select.statement.overrides, it doesn't work.
       

              Unassigned Unassigned
              kyleyjexdev Kyley Jex (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: