Uploaded image for project: 'Teiid'
  1. Teiid
  2. TEIID-5765

Criteria Pushdown is not working on FULL OUTER JOIN

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Blocker Blocker
    • 12.3, 12.1.2, 12.2.1
    • 12.0
    • Query Engine
    • None
    • Hide

      1. In MySQL create the test_a table in test_tables and dwh schemas using the following scripts:

      CREATE TABLE `dwh`.`test_a` (
        `a` int(11) DEFAULT NULL,
        `b` int(11) DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
      INSERT INTO `dwh`.`test_a` (`a`,`b`) VALUES (1,1);
      INSERT INTO `dwh`.`test_a` (`a`,`b`) VALUES (1,2);
      INSERT INTO `dwh`.`test_a` (`a`,`b`) VALUES (2,1);
      INSERT INTO `dwh`.`test_a` (`a`,`b`) VALUES (2,2);
      INSERT INTO `dwh`.`test_a` (`a`,`b`) VALUES (3,2);
      INSERT INTO `dwh`.`test_a` (`a`,`b`) VALUES (3,10);
      
      CREATE TABLE `test_tables`.`test_a` (
        `a` int(11) DEFAULT NULL,
        `b` int(11) DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
      INSERT INTO `test_tables`.`test_a` (`a`,`b`) VALUES (1,1);
      INSERT INTO `test_tables`.`test_a` (`a`,`b`) VALUES (1,2);
      INSERT INTO `test_tables`.`test_a` (`a`,`b`) VALUES (2,1);
      INSERT INTO `test_tables`.`test_a` (`a`,`b`) VALUES (2,2);
      INSERT INTO `test_tables`.`test_a` (`a`,`b`) VALUES (3,2);
      INSERT INTO `test_tables`.`test_a` (`a`,`b`) VALUES (3,10);
      

      2. Add MySQL database configuration in standalone-teiid.xml:

                      <datasource jndi-name="java:/test_dwh_my" pool-name="test_dwh_my" enabled="true" use-java-context="true">
                          <connection-url>jdbc:mysql://localhost:3306/dwh?zeroDateTimeBehavior=convertToNull</connection-url>
                          <driver>mysql</driver>
                          <new-connection-sql>set SESSION sql_mode = 'ANSI'</new-connection-sql>
                          <pool>
                              <min-pool-size>2</min-pool-size>
                              <max-pool-size>70</max-pool-size>
                          </pool>
                          <security>
                              <user-name>root</user-name>
                              <password>XXXXXX</password>
                          </security>
                          <validation>
                              <valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker"/>
                              <exception-sorter class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter"/>
                          </validation>
                          <timeout>
                              <blocking-timeout-millis>120000</blocking-timeout-millis>
                              <idle-timeout-minutes>5</idle-timeout-minutes>
                          </timeout>
                      </datasource>
                      <datasource jndi-name="java:/test" pool-name="test" enabled="true" use-java-context="true">
                          <connection-url>jdbc:mysql://localhost:3306/test_tables?zeroDateTimeBehavior=convertToNull</connection-url>
                          <driver>mysql</driver>
                          <new-connection-sql>set SESSION sql_mode = 'ANSI'</new-connection-sql>
                          <pool>
                              <min-pool-size>2</min-pool-size>
                              <max-pool-size>70</max-pool-size>
                          </pool>
                          <security>
                              <user-name>root</user-name>
                              <password>XXXXXX</password>
                          </security>
                          <validation>
                              <valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker"/>
                              <exception-sorter class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter"/>
                          </validation>
                          <timeout>
                              <blocking-timeout-millis>120000</blocking-timeout-millis>
                              <idle-timeout-minutes>5</idle-timeout-minutes>
                          </timeout>
                      </datasource>
      

      3. Add in test-vdb.xml java:/test_dwh_my and java:/test configured in previous step as datasources:

          <model name="dwh_my">
              <property name="importer.useFullSchemaName" value="false"/>
              <property name="importer.tableTypes" value="TABLE,VIEW"/>
              <property name="importer.importKeys" value="false"/>
              <source name="test_dwh_my" translator-name="mylobs" connection-jndi-name="java:/test_dwh_my"/>
          </model>
          <model name="test">
              <property name="importer.useFullSchemaName" value="false"/>
              <property name="importer.tableTypes" value="TABLE,VIEW"/>
              <property name="importer.importKeys" value="false"/>
              <source name="test" translator-name="mylobs" connection-jndi-name="java:/test"/>
          </model>
      
          <translator name="mylobs" type="mysql5">
              <property name="CopyLobs" value="true" />
              <property name="SupportsNativeQueries" value="true"/>
          </translator>
      

      4. Set TRACE level for org.teiid.PLANNER category.

      5. Run the following queries:

      SELECT 1
      from test.test_a a left outer join dwh_my.test_a b on a.A=b.b
      WHERE
           a.a=1 ;;
      
      SELECT 1
      from test.test_a a right outer join dwh_my.test_a b on a.A=b.b
      WHERE
           a.a=1 ;;
      
      SELECT 1
      from test.test_a a full outer join dwh_my.test_a b on a.A=b.b
      WHERE
           a.a=1 ;;
      

      6. Running the first query we get in the end the following query plan:

      OPTIMIZATION COMPLETE:
      PROCESSOR PLAN:
      ProjectNode(0) output=[1] [1]
        JoinNode(1) [NESTED LOOP JOIN] [LEFT OUTER JOIN] criteria=[] output=[]
          AccessNode(2) output=[] SELECT 1 FROM test.test_a AS g_0 WHERE g_0.a = 1
          AccessNode(3) output=[] SELECT 1 FROM dwh_my.test_a AS g_0 WHERE g_0.b = 1
      

      that is the "a.a=1" criteria was pushed down to the first query.

      Running the second query we get in the end the following query plan:

      OPTIMIZATION COMPLETE:
      PROCESSOR PLAN:
      ProjectNode(0) output=[1] [1]
        JoinNode(1) [NESTED LOOP JOIN] [CROSS JOIN] output=[]
          AccessNode(2) output=[] SELECT 1 FROM dwh_my.test_a AS g_0 WHERE g_0.b = 1
          AccessNode(3) output=[] SELECT 1 FROM test.test_a AS g_0 WHERE g_0.a = 1
      

      that is the "a.a=1" criteria was pushed down again to the query (this time the query is the second one).

      Running the third query we get in the end the following query plan:

      OPTIMIZATION COMPLETE:
      PROCESSOR PLAN:
      ProjectNode(0) output=[1] [1]
        SelectNode(1) output=[a.a] a.a = 1
          JoinNode(2) [MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED)] [LEFT OUTER JOIN] criteria=[a.a=b.b] output=[a.a]
            AccessNode(3) output=[a.a] SELECT g_0.a AS c_0 FROM test.test_a AS g_0 ORDER BY c_0
            AccessNode(4) output=[b.b] SELECT g_0.b AS c_0 FROM dwh_my.test_a AS g_0 ORDER BY c_0
      

      here we can see additional select node and the a.a=1 criteria wasn't pushed down at all. The most interesting thing also is that the third query was almost rewritten as the first one (we got the same LEFT OUTER JOIN). So it means that maybe it's possible to change logic a bit for pushing down criteria in some cases also for FULL OUTER JOIN. I debugged engine source code a bit and found out that actually the logic was implemented in RulePushSelectCriteria.examinePath for JOIN type. There were the following comments:
      "Check whether this criteria is on the inner side of an outer join. If so, can't push past the join" and "if we successfully optimized then this should no longer inhibit the criteria from being pushed since the criteria must then be on the outer side of an outer join or on either side of an inner join".

      Show
      1. In MySQL create the test_a table in test_tables and dwh schemas using the following scripts: CREATE TABLE `dwh`.`test_a` ( ` a ` int (11) DEFAULT NULL , `b` int (11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO `dwh`.`test_a` (` a `,`b`) VALUES (1,1); INSERT INTO `dwh`.`test_a` (` a `,`b`) VALUES (1,2); INSERT INTO `dwh`.`test_a` (` a `,`b`) VALUES (2,1); INSERT INTO `dwh`.`test_a` (` a `,`b`) VALUES (2,2); INSERT INTO `dwh`.`test_a` (` a `,`b`) VALUES (3,2); INSERT INTO `dwh`.`test_a` (` a `,`b`) VALUES (3,10); CREATE TABLE `test_tables`.`test_a` ( ` a ` int (11) DEFAULT NULL , `b` int (11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO `test_tables`.`test_a` (` a `,`b`) VALUES (1,1); INSERT INTO `test_tables`.`test_a` (` a `,`b`) VALUES (1,2); INSERT INTO `test_tables`.`test_a` (` a `,`b`) VALUES (2,1); INSERT INTO `test_tables`.`test_a` (` a `,`b`) VALUES (2,2); INSERT INTO `test_tables`.`test_a` (` a `,`b`) VALUES (3,2); INSERT INTO `test_tables`.`test_a` (` a `,`b`) VALUES (3,10); 2. Add MySQL database configuration in standalone-teiid.xml: <datasource jndi-name= "java:/test_dwh_my" pool-name= "test_dwh_my" enabled= "true" use-java-context= "true" > <connection-url> jdbc:mysql://localhost:3306/dwh?zeroDateTimeBehavior=convertToNull </connection-url> <driver> mysql </driver> <new-connection-sql> set SESSION sql_mode = 'ANSI' </new-connection-sql> <pool> <min-pool-size> 2 </min-pool-size> <max-pool-size> 70 </max-pool-size> </pool> <security> <user-name> root </user-name> <password> XXXXXX </password> </security> <validation> <valid-connection-checker class-name= "org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker" /> <exception-sorter class-name= "org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter" /> </validation> <timeout> <blocking-timeout-millis> 120000 </blocking-timeout-millis> <idle-timeout-minutes> 5 </idle-timeout-minutes> </timeout> </datasource> <datasource jndi-name= "java:/test" pool-name= "test" enabled= "true" use-java-context= "true" > <connection-url> jdbc:mysql://localhost:3306/test_tables?zeroDateTimeBehavior=convertToNull </connection-url> <driver> mysql </driver> <new-connection-sql> set SESSION sql_mode = 'ANSI' </new-connection-sql> <pool> <min-pool-size> 2 </min-pool-size> <max-pool-size> 70 </max-pool-size> </pool> <security> <user-name> root </user-name> <password> XXXXXX </password> </security> <validation> <valid-connection-checker class-name= "org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker" /> <exception-sorter class-name= "org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter" /> </validation> <timeout> <blocking-timeout-millis> 120000 </blocking-timeout-millis> <idle-timeout-minutes> 5 </idle-timeout-minutes> </timeout> </datasource> 3. Add in test-vdb.xml java:/test_dwh_my and java:/test configured in previous step as datasources: <model name= "dwh_my" > <property name= "importer.useFullSchemaName" value= "false" /> <property name= "importer.tableTypes" value= "TABLE,VIEW" /> <property name= "importer.importKeys" value= "false" /> <source name= "test_dwh_my" translator-name= "mylobs" connection-jndi-name= "java:/test_dwh_my" /> </model> <model name= "test" > <property name= "importer.useFullSchemaName" value= "false" /> <property name= "importer.tableTypes" value= "TABLE,VIEW" /> <property name= "importer.importKeys" value= "false" /> <source name= "test" translator-name= "mylobs" connection-jndi-name= "java:/test" /> </model> <translator name= "mylobs" type= "mysql5" > <property name= "CopyLobs" value= "true" /> <property name= "SupportsNativeQueries" value= "true" /> </translator> 4. Set TRACE level for org.teiid.PLANNER category. 5. Run the following queries: SELECT 1 from test.test_a a left outer join dwh_my.test_a b on a . A =b.b WHERE a . a =1 ;; SELECT 1 from test.test_a a right outer join dwh_my.test_a b on a . A =b.b WHERE a . a =1 ;; SELECT 1 from test.test_a a full outer join dwh_my.test_a b on a . A =b.b WHERE a . a =1 ;; 6. Running the first query we get in the end the following query plan: OPTIMIZATION COMPLETE: PROCESSOR PLAN: ProjectNode(0) output=[1] [1] JoinNode(1) [NESTED LOOP JOIN] [LEFT OUTER JOIN] criteria=[] output=[] AccessNode(2) output=[] SELECT 1 FROM test.test_a AS g_0 WHERE g_0.a = 1 AccessNode(3) output=[] SELECT 1 FROM dwh_my.test_a AS g_0 WHERE g_0.b = 1 that is the "a.a=1" criteria was pushed down to the first query. Running the second query we get in the end the following query plan: OPTIMIZATION COMPLETE: PROCESSOR PLAN: ProjectNode(0) output=[1] [1] JoinNode(1) [NESTED LOOP JOIN] [CROSS JOIN] output=[] AccessNode(2) output=[] SELECT 1 FROM dwh_my.test_a AS g_0 WHERE g_0.b = 1 AccessNode(3) output=[] SELECT 1 FROM test.test_a AS g_0 WHERE g_0.a = 1 that is the "a.a=1" criteria was pushed down again to the query (this time the query is the second one). Running the third query we get in the end the following query plan: OPTIMIZATION COMPLETE: PROCESSOR PLAN: ProjectNode(0) output=[1] [1] SelectNode(1) output=[a.a] a.a = 1 JoinNode(2) [MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED)] [LEFT OUTER JOIN] criteria=[a.a=b.b] output=[a.a] AccessNode(3) output=[a.a] SELECT g_0.a AS c_0 FROM test.test_a AS g_0 ORDER BY c_0 AccessNode(4) output=[b.b] SELECT g_0.b AS c_0 FROM dwh_my.test_a AS g_0 ORDER BY c_0 here we can see additional select node and the a.a=1 criteria wasn't pushed down at all. The most interesting thing also is that the third query was almost rewritten as the first one (we got the same LEFT OUTER JOIN). So it means that maybe it's possible to change logic a bit for pushing down criteria in some cases also for FULL OUTER JOIN. I debugged engine source code a bit and found out that actually the logic was implemented in RulePushSelectCriteria.examinePath for JOIN type. There were the following comments: "Check whether this criteria is on the inner side of an outer join. If so, can't push past the join" and "if we successfully optimized then this should no longer inhibit the criteria from being pushed since the criteria must then be on the outer side of an outer join or on either side of an inner join".

      Criteria Pushdown is not working on FULL OUTER JOIN (according to query plan criteria is not pushed down at all, it's in criteria of LEFT OUTER JOIN node):

      Unable to find source-code formatter for language: noformat. Available languages are: actionscript, ada, applescript, bash, c, c#, c++, cpp, css, erlang, go, groovy, haskell, html, java, javascript, js, json, lua, none, nyan, objc, perl, php, python, r, rainbow, ruby, scala, sh, sql, swift, visualbasic, xml, yaml
      ----------------------------------------------------------------------------
      OPTIMIZATION COMPLETE:
      PROCESSOR PLAN:
      ProjectNode(0) output=[1] [1]
        SelectNode(1) output=[a.a] a.a = 1
          JoinNode(2) [MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED)] [LEFT OUTER JOIN] criteria=[a.a=b.b] output=[a.a]
            AccessNode(3) output=[a.a] SELECT g_0.a AS c_0 FROM test.test_a AS g_0 ORDER BY c_0
            AccessNode(4) output=[b.b] SELECT g_0.b AS c_0 FROM dwh_my.test_a AS g_0 ORDER BY c_0
      

              rhn-engineering-shawkins Steven Hawkins
              dalex005 Dmitrii Pogorelov (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Created:
                Updated:
                Resolved: