[ https://issues.apache.org/jira/browse/CARBONDATA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramakrishna S updated CARBONDATA-1740:
--------------------------------------
    Description:
0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem3 group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
Error: org.apache.spark.sql.AnalysisException: expression '`lineitem3_l_returnflag`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;;
Project [l_returnflag#2356, l_linestatus#2366, sum(l_quantity)#2791, sum(l_extendedprice)#2792]
+- Sort [aggOrder#2795 ASC NULLS FIRST, aggOrder#2796 ASC NULLS FIRST], true
   +- !Aggregate [l_returnflag#2356, l_linestatus#2366], [l_returnflag#2356, l_linestatus#2366, sum(l_quantity#2362) AS sum(l_quantity)#2791, sum(l_extendedprice#2363) AS sum(l_extendedprice)#2792, lineitem3_l_returnflag#2341 AS aggOrder#2795, lineitem3_l_linestatus#2342 AS aggOrder#2796]
      +- SubqueryAlias lineitem3
         +- Relation[L_SHIPDATE#2353,L_SHIPMODE#2354,L_SHIPINSTRUCT#2355,L_RETURNFLAG#2356,L_RECEIPTDATE#2357,L_ORDERKEY#2358,L_PARTKEY#2359,L_SUPPKEY#2360,L_LINENUMBER#2361,L_QUANTITY#2362,L_EXTENDEDPRICE#2363,L_DISCOUNT#2364,L_TAX#2365,L_LINESTATUS#2366,L_COMMITDATE#2367,L_COMMENT#2368] CarbonDatasourceHadoopRelation [ Database name :test_db1, Table name :lineitem3, Schema :Some(StructType(StructField(L_SHIPDATE,StringType,true), StructField(L_SHIPMODE,StringType,true), StructField(L_SHIPINSTRUCT,StringType,true), StructField(L_RETURNFLAG,StringType,true), StructField(L_RECEIPTDATE,StringType,true), StructField(L_ORDERKEY,StringType,true), StructField(L_PARTKEY,StringType,true), StructField(L_SUPPKEY,StringType,true), StructField(L_LINENUMBER,IntegerType,true), StructField(L_QUANTITY,DoubleType,true), StructField(L_EXTENDEDPRICE,DoubleType,true), StructField(L_DISCOUNT,DoubleType,true), StructField(L_TAX,DoubleType,true), StructField(L_LINESTATUS,StringType,true), StructField(L_COMMITDATE,StringType,true), StructField(L_COMMENT,StringType,true))) ] (state=,code=0)

0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
+---------------+---------------+------------------+------------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
+---------------+---------------+------------------+------------------------+--+
| A             | F             | 1.263625E7       | 1.8938515425239815E10  |
| N             | F             | 327800.0         | 4.913876776200002E8    |
| N             | O             | 2.5398626E7      | 3.810981608977963E10   |
| R             | F             | 1.2643878E7      | 1.8948524305619884E10  |
+---------------+---------------+------------------+------------------------+--+

*+Expected:+*: one of these should have been the behaviour:
1. Ignore segment filter and use all segments for pre-aggregate load. At the time of query run, if segment filter is set then ignore the pre-aggr table and fetch data from main table. (*Preferred*)
Or
2. Reject pre-aggregate creation when segment filter is set or vice versa.

*+Actual:+* Partial data returned

  was:
1. Create a table
create table if not exists lineitem2(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');

2. Load 2 times to create 2 segments
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem2 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

3. Check the table content without setting any filter:
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem2 group by l_returnflag, l_linestatus;
+---------------+---------------+------------------+------------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
+---------------+---------------+------------------+------------------------+--+
| N             | F             | 327800.0         | 4.913876776200004E8    |
| A             | F             | 1.263625E7       | 1.893851542524009E10   |
| N             | O             | 2.5398626E7      | 3.810981608977967E10   |
| R             | F             | 1.2643878E7      | 1.8948524305619976E10  |
+---------------+---------------+------------------+------------------------+--+

4. Set segment filter on the main table:
set carbon.input.segments.test_db1.lineitem2=1;
+-------------------------------------------+--------+--+
| key                                       | value  |
+-------------------------------------------+--------+--+
| carbon.input.segments.test_db1.lineitem2  | 1      |
+-------------------------------------------+--------+--+

5. Create pre-aggregate table
create datamap agr_lineitem2 ON TABLE lineitem2 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem2 group by L_RETURNFLAG, L_LINESTATUS;

6. Check table content:
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem2 group by l_returnflag, l_linestatus;
+---------------+---------------+------------------+------------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
+---------------+---------------+------------------+------------------------+--+
| N             | F             | 163900.0         | 2.4569383881000024E8   |
| A             | F             | 6318125.0        | 9.469257712620043E9    |
| N             | O             | 1.2699313E7      | 1.9054908044889835E10  |
| R             | F             | 6321939.0        | 9.474262152809986E9    |
+---------------+---------------+------------------+------------------------+--+

7. Remove the filter on segment
0: jdbc:hive2://10.18.98.48:23040> reset;

8. Check the table content:
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem2 group by l_returnflag, l_linestatus;
+---------------+---------------+------------------+------------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
+---------------+---------------+------------------+------------------------+--+
| N             | F             | 163900.0         | 2.4569383881000024E8   |
| A             | F             | 6318125.0        | 9.469257712620043E9    |
| N             | O             | 1.2699313E7      | 1.9054908044889835E10  |
| R             | F             | 6321939.0        | 9.474262152809986E9    |
+---------------+---------------+------------------+------------------------+--+
4 rows selected (2.341 seconds)

9. Load one more time:

10. Check table content
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem2 group by l_returnflag, l_linestatus;
+---------------+---------------+------------------+------------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
+---------------+---------------+------------------+------------------------+--+
| N             | F             | 327800.0         | 4.913876776200005E8    |
| A             | F             | 1.263625E7       | 1.8938515425240086E10  |
| N             | O             | 2.5398626E7      | 3.810981608977967E10   |
| R             | F             | 1.2643878E7      | 1.8948524305619972E10  |
+---------------+---------------+------------------+------------------------+--+
4 rows selected (0.936 seconds)

*+Expected:+*: one of these should have been the behaviour:
1. Ignore segment filter and use all segments for pre-aggregate load. At the time of query run, if segment filter is set then ignore the pre-aggr table and fetch data from main table. (*Preferred*)
Or
2. Reject pre-aggregate creation when segment filter is set or vice versa.

*+Actual:+* Partial data returned


> Carbon1.3.0-Pre-AggregateTable - Aggregate query with order by fails when main table is having pre-aggregate table
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-1740
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1740
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.3.0
>         Environment: Test - 3 node ant cluster
>            Reporter: Ramakrishna S
>              Labels: DFX
>             Fix For: 1.3.0
>
>
> 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem3 group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
> Error: org.apache.spark.sql.AnalysisException: expression '`lineitem3_l_returnflag`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;;
> Project [l_returnflag#2356, l_linestatus#2366, sum(l_quantity)#2791, sum(l_extendedprice)#2792]
> +- Sort [aggOrder#2795 ASC NULLS FIRST, aggOrder#2796 ASC NULLS FIRST], true
>    +- !Aggregate [l_returnflag#2356, l_linestatus#2366], [l_returnflag#2356, l_linestatus#2366, sum(l_quantity#2362) AS sum(l_quantity)#2791, sum(l_extendedprice#2363) AS sum(l_extendedprice)#2792, lineitem3_l_returnflag#2341 AS aggOrder#2795, lineitem3_l_linestatus#2342 AS aggOrder#2796]
>       +- SubqueryAlias lineitem3
>          +- Relation[L_SHIPDATE#2353,L_SHIPMODE#2354,L_SHIPINSTRUCT#2355,L_RETURNFLAG#2356,L_RECEIPTDATE#2357,L_ORDERKEY#2358,L_PARTKEY#2359,L_SUPPKEY#2360,L_LINENUMBER#2361,L_QUANTITY#2362,L_EXTENDEDPRICE#2363,L_DISCOUNT#2364,L_TAX#2365,L_LINESTATUS#2366,L_COMMITDATE#2367,L_COMMENT#2368] CarbonDatasourceHadoopRelation [ Database name :test_db1, Table name :lineitem3, Schema :Some(StructType(StructField(L_SHIPDATE,StringType,true), StructField(L_SHIPMODE,StringType,true), StructField(L_SHIPINSTRUCT,StringType,true), StructField(L_RETURNFLAG,StringType,true), StructField(L_RECEIPTDATE,StringType,true), StructField(L_ORDERKEY,StringType,true), StructField(L_PARTKEY,StringType,true), StructField(L_SUPPKEY,StringType,true), StructField(L_LINENUMBER,IntegerType,true), StructField(L_QUANTITY,DoubleType,true), StructField(L_EXTENDEDPRICE,DoubleType,true), StructField(L_DISCOUNT,DoubleType,true), StructField(L_TAX,DoubleType,true), StructField(L_LINESTATUS,StringType,true), StructField(L_COMMITDATE,StringType,true), StructField(L_COMMENT,StringType,true))) ] (state=,code=0)
>
> 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
> +---------------+---------------+------------------+------------------------+--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
> +---------------+---------------+------------------+------------------------+--+
> | A             | F             | 1.263625E7       | 1.8938515425239815E10  |
> | N             | F             | 327800.0         | 4.913876776200002E8    |
> | N             | O             | 2.5398626E7      | 3.810981608977963E10   |
> | R             | F             | 1.2643878E7      | 1.8948524305619884E10  |
> +---------------+---------------+------------------+------------------------+--+
>
> *+Expected:+*: one of these should have been the behaviour:
> 1. Ignore segment filter and use all segments for pre-aggregate load. At the time of query run, if segment filter is set then ignore the pre-aggr table and fetch data from main table. (*Preferred*)
> Or
> 2. Reject pre-aggregate creation when segment filter is set or vice versa.
>
> *+Actual:+* Partial data returned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)