Uploaded image for project: 'Teiid'
  1. Teiid
  2. TEIID-3854

Impala translator - ORDER BY clause using incorrect columns when not in SELECT list

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Critical Critical
    • 8.12.3, 8.13
    • 8.12.1, 8.12.2
    • JDBC Connector
    • None
    • Hide

      Execute teiid on Cloudera Impala over CDH 5.3.8:

      SELECT process_date_key,sum(num_clicks) as num_clicks,sum(num_clicks_comp) as num_clicks_comp,process_month_name,process_day_name FROM ( SELECT process_date_key,sum(num_clicks) as num_clicks,0 as num_clicks_comp,process_month_name,process_day_name,process_month,process_day_num FROM ActivityPublisherProcessDate WHERE publisher_key=112 AND process_date_key>='2014-05-25' AND process_date_key<='2014-06-05' GROUP BY process_date_key,process_month_name,process_day_name,process_month,process_day_num UNION ALL SELECT process_date_key,0 as num_clicks,sum(num_clicks) as num_clicks_comp,process_month_name,process_day_name,process_month,process_day_num FROM ActivityPublisherProcessDate WHERE publisher_key=112 AND process_date_key>='2015-06-26' AND process_date_key<='2015-07-05' GROUP BY process_date_key,process_month_name,process_day_name,process_month,process_day_num ) inr GROUP BY process_date_key,process_month_name,process_day_name,process_month,process_day_num ORDER BY process_month ASC,process_day_num ASC LIMIT 25;

      The query that is issued to Impala:

      SELECT v_0.c_0, SUM(v_0.c_5) AS c_1, SUM(v_0.c_6) AS c_2, v_0.c_1 AS c_3, v_0.c_2 AS c_4 FROM (SELECT c_0, c_1, c_2, c_3, c_4, c_5, c_6 FROM ( SELECT g_2.process_date_key AS c_0, g_3.month_name AS c_1, g_3.day_name AS c_2, g_3.month_num AS c_3, g_3.day_num AS c_4, SUM(g_2.num_clicks) AS c_5, 0 AS c_6 FROM mart.fact_activity_publisher_process_date g_2 JOIN shared.dim_date g_3 ON g_2.process_date_key = g_3.date_key WHERE g_2.publisher_key = 112 AND g_2.process_date_key >= '2014-05-25' AND g_2.process_date_key <= '2014-06-05' AND g_3.date_key >= '2014-05-25' AND g_3.date_key <= '2014-06-05' GROUP BY g_2.process_date_key, g_3.month_name, g_3.day_name, g_3.month_num, g_3.day_num UNION ALL SELECT g_0.process_date_key AS c_0, g_1.month_name AS c_1, g_1.day_name AS c_2, g_1.month_num AS c_3, g_1.day_num AS c_4, 0 AS c_5, SUM(g_0.num_clicks) AS c_6 FROM mart.fact_activity_publisher_process_date g_0 JOIN shared.dim_date g_1 ON g_0.process_date_key = g_1.date_key WHERE g_0.publisher_key = 112 AND g_0.process_date_key >= '2015-06-26' AND g_0.process_date_key <= '2015-07-05' AND g_1.date_key >= '2015-06-26' AND g_1.date_key <= '2015-07-05' GROUP BY g_0.process_date_key, g_1.month_name, g_1.day_name, g_1.month_num, g_1.day_num) X__) v_0 GROUP BY v_0.c_0, v_0.c_1, v_0.c_2, v_0.c_3, v_0.c_4 ORDER BY c_3, c_4 LIMIT 25

      Note the c_3 and c_4 were re-aliased in the select to the month and day name respectively by the translator.

      Show
      Execute teiid on Cloudera Impala over CDH 5.3.8: SELECT process_date_key,sum(num_clicks) as num_clicks,sum(num_clicks_comp) as num_clicks_comp,process_month_name,process_day_name FROM ( SELECT process_date_key,sum(num_clicks) as num_clicks,0 as num_clicks_comp,process_month_name,process_day_name,process_month,process_day_num FROM ActivityPublisherProcessDate WHERE publisher_key=112 AND process_date_key>='2014-05-25' AND process_date_key<='2014-06-05' GROUP BY process_date_key,process_month_name,process_day_name,process_month,process_day_num UNION ALL SELECT process_date_key,0 as num_clicks,sum(num_clicks) as num_clicks_comp,process_month_name,process_day_name,process_month,process_day_num FROM ActivityPublisherProcessDate WHERE publisher_key=112 AND process_date_key>='2015-06-26' AND process_date_key<='2015-07-05' GROUP BY process_date_key,process_month_name,process_day_name,process_month,process_day_num ) inr GROUP BY process_date_key,process_month_name,process_day_name,process_month,process_day_num ORDER BY process_month ASC,process_day_num ASC LIMIT 25; The query that is issued to Impala: SELECT v_0.c_0, SUM(v_0.c_5) AS c_1, SUM(v_0.c_6) AS c_2, v_0.c_1 AS c_3, v_0.c_2 AS c_4 FROM (SELECT c_0, c_1, c_2, c_3, c_4, c_5, c_6 FROM ( SELECT g_2.process_date_key AS c_0, g_3.month_name AS c_1, g_3.day_name AS c_2, g_3.month_num AS c_3, g_3.day_num AS c_4, SUM(g_2.num_clicks) AS c_5, 0 AS c_6 FROM mart.fact_activity_publisher_process_date g_2 JOIN shared.dim_date g_3 ON g_2.process_date_key = g_3.date_key WHERE g_2.publisher_key = 112 AND g_2.process_date_key >= '2014-05-25' AND g_2.process_date_key <= '2014-06-05' AND g_3.date_key >= '2014-05-25' AND g_3.date_key <= '2014-06-05' GROUP BY g_2.process_date_key, g_3.month_name, g_3.day_name, g_3.month_num, g_3.day_num UNION ALL SELECT g_0.process_date_key AS c_0, g_1.month_name AS c_1, g_1.day_name AS c_2, g_1.month_num AS c_3, g_1.day_num AS c_4, 0 AS c_5, SUM(g_0.num_clicks) AS c_6 FROM mart.fact_activity_publisher_process_date g_0 JOIN shared.dim_date g_1 ON g_0.process_date_key = g_1.date_key WHERE g_0.publisher_key = 112 AND g_0.process_date_key >= '2015-06-26' AND g_0.process_date_key <= '2015-07-05' AND g_1.date_key >= '2015-06-26' AND g_1.date_key <= '2015-07-05' GROUP BY g_0.process_date_key, g_1.month_name, g_1.day_name, g_1.month_num, g_1.day_num) X__) v_0 GROUP BY v_0.c_0, v_0.c_1, v_0.c_2, v_0.c_3, v_0.c_4 ORDER BY c_3, c_4 LIMIT 25 Note the c_3 and c_4 were re-aliased in the select to the month and day name respectively by the translator.

      We have a scenario in which users want to see month and day names but the data should be sorted by month and day NUMBER. The translated query comes back sorted the wrong column. Oddly in the ORDER clause the translator uses the month name column alias when we specify the month number, day name when we specify day number.

              rhn-engineering-shawkins Steven Hawkins
              dkrapohl@hotmail.com Donald Krapohl (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Created:
                Updated:
                Resolved: