Uploaded image for project: 'Teiid'
  1. Teiid
  2. TEIID-5959

Teiid doesn't re-optimize select sub-nodes after adding new criteria

    XMLWordPrintable

Details

    • Quality Risk
    • Resolution: Done
    • Major
    • 15.0
    • 13.1
    • Query Engine
    • None
    • DV Sprint 64
    • 0.5

    Description

      It looks like Teiid doesn't re-optimize union nodes after adding new criteria to their sub-select nodes. For example, for the query (the user has only role1):

      -- QUERY 1
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1') ;;
      

      Teiid will generate the following Query plan:

      AccessNode(0) output=[test_tables_pg2.test_a.a, test_tables_pg2.test_a.b] SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg2.test_a AS g_0 LIMIT 100 
      

      what is correct because the user has only role1 and Teiid optimizes union removing the first sub-query as its condition in total is false.

      But for this query:

      -- QUERY 2 (adds a where condition)
      select * from (
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1')) a
      where a = 1 ;;
      

      Teiid will propagate the "a = 1" condition in all union sub-queries and generate the following query plan:

      LimitNode(0) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b] limit 100                                                                                                        
        UnionAllNode(1) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b]                                                                                                               
          AccessNode(2) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b] SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg.test_a AS g_0 WHERE (FALSE = TRUE) AND (g_0.a = 1) LIMIT 100                                                                                                                                                                                  
          AccessNode(3) output=[test_tables_pg2.test_a.a, test_tables_pg2.test_a.b] SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg2.test_a AS g_0 WHERE g_0.a = 1 LIMIT 100
      

      what is incorrect as the

      SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg.test_a AS g_0 WHERE (FALSE = TRUE) AND (g_0.a = 1) LIMIT 100
      

      still has false condition and logically the query plan for the second query should be the same as for the first one, though it doesn't have an influence on results.
      I don't know whether it's a bug it not. That's why I created the issue with the "Quality Risk" type.

      Steps to reproduce:
      1. In PostgreSQL create the test_a table using the following scripts:

      CREATE TABLE public.test_a
      (
        a integer,
        b integer
      );
      INSERT INTO public.test_a(a, b) VALUES (1, 1);
      INSERT INTO public.test_a(a, b) VALUES (1, 2);
      INSERT INTO public.test_a(a, b) VALUES (2, 1);
      INSERT INTO public.test_a(a, b) VALUES (2, 2);
      INSERT INTO public.test_a(a, b) VALUES (3, 2);
      INSERT INTO public.test_a(a, b) VALUES (3, 10);
      

      2. Add PostgreSQL database configuration in standalone-teiid.xml:

      <datasource jndi-name="java:/test_tables_pg" pool-name="test_tables_pg" enabled="true" use-java-context="true">
                          <connection-url>jdbc:postgresql://localhost:5432/test?charSet=utf8</connection-url>
                          <driver-class>org.postgresql.Driver</driver-class>
                          <driver>org.postgresql</driver>
                          <pool>
                              <min-pool-size>2</min-pool-size>
                              <max-pool-size>70</max-pool-size>
                              <prefill>false</prefill>
                              <use-strict-min>false</use-strict-min>
                              <flush-strategy>FailingConnectionOnly</flush-strategy>
                          </pool>
                          <security>
                              <user-name>XXXXX</user-name>
                              <password>XXXXX</password>
                          </security>
                          <validation>
                              <check-valid-connection-sql>select 0</check-valid-connection-sql>
                          </validation>
                          <timeout>
                              <blocking-timeout-millis>120000</blocking-timeout-millis>
                              <idle-timeout-minutes>5</idle-timeout-minutes>
                          </timeout>
                      </datasource>
      <datasource jndi-name="java:/test_tables_pg2" pool-name="test_tables_pg2" enabled="true" use-java-context="true">
                          <connection-url>jdbc:postgresql://localhost:5432/test?charSet=utf8</connection-url>
                          <driver-class>org.postgresql.Driver</driver-class>
                          <driver>org.postgresql</driver>
                          <pool>
                              <min-pool-size>2</min-pool-size>
                              <max-pool-size>70</max-pool-size>
                              <prefill>false</prefill>
                              <use-strict-min>false</use-strict-min>
                              <flush-strategy>FailingConnectionOnly</flush-strategy>
                          </pool>
                          <security>
                              <user-name>XXXXX</user-name>
                              <password>XXXXX</password>
                          </security>
                          <validation>
                              <check-valid-connection-sql>select 0</check-valid-connection-sql>
                          </validation>
                          <timeout>
                              <blocking-timeout-millis>120000</blocking-timeout-millis>
                              <idle-timeout-minutes>5</idle-timeout-minutes>
                          </timeout>
                      </datasource>
      

      3. Add in test-vdb.xml java:/test_tables_pg configured in the previous step as a data source:

          <model name="test_tables_pg">
              <property name="importer.useFullSchemaName" value="false"/>
              <property name="importer.tableTypes" value="TABLE,VIEW"/>
              <property name="importer.importKeys" value="false"/>
              <source name="test_tables_pg" translator-name="myPg" connection-jndi-name="java:/test_tables_pg"/>
          </model>
          <model name="test_tables_pg2">
              <property name="importer.useFullSchemaName" value="false"/>
              <property name="importer.tableTypes" value="TABLE,VIEW"/>
              <property name="importer.importKeys" value="false"/>
              <source name="test_tables_pg2" translator-name="myPg" connection-jndi-name="java:/test_tables_pg2"/>
          </model>
      

      4. add the "role1" configuration:

          <data-role name="role1" any-authenticated="true" allow-create-temporary-tables="true">
              <description>Allow read only</description>
              <permission>
                  <resource-name>test_tables_pg2.test_a</resource-name>
                  <allow-read>true</allow-read>
                  <allow-execute>true</allow-execute>
              </permission>
              <permission>
                  <resource-name>test_tables_pg.test_a</resource-name>
                  <allow-read>true</allow-read>
                  <allow-execute>true</allow-execute>
              </permission>
          </data-role>
      

      5. run the following queries and compare their query plans:

      -- QUERY 1
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1') ;;
      
      select hasRole('data','report-role') ;;
      
      select hasRole('role1') ;;
      
      -- QUERY 2 (adds a where condition)
      select * from (
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1')) a
      where a = 1 ;;
      

      Attachments

        Activity

          People

            rhn-engineering-shawkins Steven Hawkins
            dalex005 Dmitrii Pogorelov
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - 4 hours
                4h
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 4 hours
                4h