Uploaded image for project: 'Teiid'
  1. Teiid
  2. TEIID-5959

Teiid doesn't re-optimize select sub-nodes after adding new criteria

    Details

    • Type: Quality Risk
    • Status: Resolved (View Workflow)
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 13.1
    • Fix Version/s: 15.0
    • Component/s: Query Engine
    • Labels:
      None
    • Sprint:
      DV Sprint 64
    • Story Points:
      0.5

      Description

      It looks like Teiid doesn't re-optimize union nodes after adding new criteria to their sub-select nodes. For example, for the query (the user has only role1):

      -- QUERY 1
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1') ;;
      

      Teiid will generate the following Query plan:

      AccessNode(0) output=[test_tables_pg2.test_a.a, test_tables_pg2.test_a.b] SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg2.test_a AS g_0 LIMIT 100 
      

      what is correct because the user has only role1 and Teiid optimizes union removing the first sub-query as its condition in total is false.

      But for this query:

      -- QUERY 2 (adds a where condition)
      select * from (
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1')) a
      where a = 1 ;;
      

      Teiid will propagate the "a = 1" condition in all union sub-queries and generate the following query plan:

      LimitNode(0) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b] limit 100                                                                                                        
        UnionAllNode(1) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b]                                                                                                               
          AccessNode(2) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b] SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg.test_a AS g_0 WHERE (FALSE = TRUE) AND (g_0.a = 1) LIMIT 100                                                                                                                                                                                  
          AccessNode(3) output=[test_tables_pg2.test_a.a, test_tables_pg2.test_a.b] SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg2.test_a AS g_0 WHERE g_0.a = 1 LIMIT 100
      

      what is incorrect as the

      SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg.test_a AS g_0 WHERE (FALSE = TRUE) AND (g_0.a = 1) LIMIT 100
      

      still has false condition and logically the query plan for the second query should be the same as for the first one, though it doesn't have an influence on results.
      I don't know whether it's a bug it not. That's why I created the issue with the "Quality Risk" type.

      Steps to reproduce:
      1. In PostgreSQL create the test_a table using the following scripts:

      CREATE TABLE public.test_a
      (
        a integer,
        b integer
      );
      INSERT INTO public.test_a(a, b) VALUES (1, 1);
      INSERT INTO public.test_a(a, b) VALUES (1, 2);
      INSERT INTO public.test_a(a, b) VALUES (2, 1);
      INSERT INTO public.test_a(a, b) VALUES (2, 2);
      INSERT INTO public.test_a(a, b) VALUES (3, 2);
      INSERT INTO public.test_a(a, b) VALUES (3, 10);
      

      2. Add PostgreSQL database configuration in standalone-teiid.xml:

      <datasource jndi-name="java:/test_tables_pg" pool-name="test_tables_pg" enabled="true" use-java-context="true">
                          <connection-url>jdbc:postgresql://localhost:5432/test?charSet=utf8</connection-url>
                          <driver-class>org.postgresql.Driver</driver-class>
                          <driver>org.postgresql</driver>
                          <pool>
                              <min-pool-size>2</min-pool-size>
                              <max-pool-size>70</max-pool-size>
                              <prefill>false</prefill>
                              <use-strict-min>false</use-strict-min>
                              <flush-strategy>FailingConnectionOnly</flush-strategy>
                          </pool>
                          <security>
                              <user-name>XXXXX</user-name>
                              <password>XXXXX</password>
                          </security>
                          <validation>
                              <check-valid-connection-sql>select 0</check-valid-connection-sql>
                          </validation>
                          <timeout>
                              <blocking-timeout-millis>120000</blocking-timeout-millis>
                              <idle-timeout-minutes>5</idle-timeout-minutes>
                          </timeout>
                      </datasource>
      <datasource jndi-name="java:/test_tables_pg2" pool-name="test_tables_pg2" enabled="true" use-java-context="true">
                          <connection-url>jdbc:postgresql://localhost:5432/test?charSet=utf8</connection-url>
                          <driver-class>org.postgresql.Driver</driver-class>
                          <driver>org.postgresql</driver>
                          <pool>
                              <min-pool-size>2</min-pool-size>
                              <max-pool-size>70</max-pool-size>
                              <prefill>false</prefill>
                              <use-strict-min>false</use-strict-min>
                              <flush-strategy>FailingConnectionOnly</flush-strategy>
                          </pool>
                          <security>
                              <user-name>XXXXX</user-name>
                              <password>XXXXX</password>
                          </security>
                          <validation>
                              <check-valid-connection-sql>select 0</check-valid-connection-sql>
                          </validation>
                          <timeout>
                              <blocking-timeout-millis>120000</blocking-timeout-millis>
                              <idle-timeout-minutes>5</idle-timeout-minutes>
                          </timeout>
                      </datasource>
      

      3. Add in test-vdb.xml java:/test_tables_pg configured in the previous step as a data source:

          <model name="test_tables_pg">
              <property name="importer.useFullSchemaName" value="false"/>
              <property name="importer.tableTypes" value="TABLE,VIEW"/>
              <property name="importer.importKeys" value="false"/>
              <source name="test_tables_pg" translator-name="myPg" connection-jndi-name="java:/test_tables_pg"/>
          </model>
          <model name="test_tables_pg2">
              <property name="importer.useFullSchemaName" value="false"/>
              <property name="importer.tableTypes" value="TABLE,VIEW"/>
              <property name="importer.importKeys" value="false"/>
              <source name="test_tables_pg2" translator-name="myPg" connection-jndi-name="java:/test_tables_pg2"/>
          </model>
      

      4. add the "role1" configuration:

          <data-role name="role1" any-authenticated="true" allow-create-temporary-tables="true">
              <description>Allow read only</description>
              <permission>
                  <resource-name>test_tables_pg2.test_a</resource-name>
                  <allow-read>true</allow-read>
                  <allow-execute>true</allow-execute>
              </permission>
              <permission>
                  <resource-name>test_tables_pg.test_a</resource-name>
                  <allow-read>true</allow-read>
                  <allow-execute>true</allow-execute>
              </permission>
          </data-role>
      

      5. run the following queries and compare their query plans:

      -- QUERY 1
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1') ;;
      
      select hasRole('data','report-role') ;;
      
      select hasRole('role1') ;;
      
      -- QUERY 2 (adds a where condition)
      select * from (
      select * from "test_tables_pg.test_a" where hasRole('report-role')
      union all
      select * from "test_tables_pg2.test_a" where hasRole('role1')) a
      where a = 1 ;;
      

        Gliffy Diagrams

          Attachments

            Activity

              People

              • Assignee:
                shawkins Steven Hawkins
                Reporter:
                dalex005 Dmitrii Pogorelov
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - 4 hours
                  4h
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 4 hours
                  4h