GitHub user kumarvishal09 opened a pull request:
https://github.com/apache/carbondata/pull/1694

[WIP] Added code to support case expression

Case expression support

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata Expression_Support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1694

----

commit 91536ee8960c1bedf26de30aaf7cf4a1721c077f
Author: kumarvishal <kumarvishal.1802@...>
Date: 2017-12-20T10:16:02Z

    Added code to support case expression

---
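For context, the change lets a pre-aggregate datamap be defined over a CASE expression. A minimal sketch of the kind of statement this enables, based on the test cases added in this PR (here `sql(...)` stands for the test suite's helper or `spark.sql(...)` in an application):

```scala
// Sketch only, drawn from the tests added in this PR: a pre-aggregate datamap
// whose aggregate is computed over a CASE expression on the parent table.
sql(
  "CREATE TABLE mainTable(id INT, name STRING, city STRING, age STRING) " +
  "STORED BY 'org.apache.carbondata.format'")
sql(
  "CREATE DATAMAP agg1 ON TABLE mainTable USING 'preaggregate' AS " +
  "SELECT name, SUM(CASE WHEN age = 35 THEN id ELSE 0 END) FROM mainTable GROUP BY name")
```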
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2189/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1694 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2461/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/967/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2190/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1694 SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2462/ ---
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/1694 retest this please ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1112/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2328/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2363/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1694 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1147/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1694 SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2575/ ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158930636

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
--- End diff --

add license header

---
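For reference, the header being asked for is the standard ASF license header used on the project's other Scala sources; a sketch of how the top of the new test file would look (the exact text should be copied from an existing file in the repository):

```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.carbondata.integration.spark.testsuite.preaggregate
```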
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158930733

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
+
+  override def beforeAll: Unit = {
+    sql("drop table if exists mainTable")
+    sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
+    sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
+    sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
+    sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
--- End diff --

Format the SQL strings so they read better: keywords should be capitalized and each statement should be split across multiple lines.

---
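A sketch of how one of the statements above could be reformatted along those lines (formatting only; the statement itself is unchanged from the diff):

```scala
// One of the CREATE DATAMAP statements from the diff, with capitalized
// keywords and split across lines via a margin-stripped multi-line string.
sql(
  """
    | CREATE DATAMAP agg1 ON TABLE mainTable
    | USING 'preaggregate'
    | AS SELECT name, SUM(CASE WHEN age = 35 THEN id ELSE 0 END)
    | FROM mainTable
    | GROUP BY name
  """.stripMargin)
```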
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158930929

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ---
@@ -467,10 +474,13 @@ class TableNewProcessor(cm: TableModel) {
     // Sort columns should be at the begin of all columns
     cm.sortKeyDims.get.foreach { keyDim =>
       val field = cm.dimCols.find(keyDim equals _.column).get
-      val encoders = if (cm.parentTable.isDefined && cm.dataMapRelation.get.get(field).isDefined) {
+      val encoders = if (cm.parentTable.isDefined &&
+                         cm.dataMapRelation.get.get(field).isDefined &&
+                         cm.dataMapRelation.get.get(field).get.columnTableRelationList.size==1 ) {
--- End diff --

add a space before and after `==`

---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158931027

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ---
@@ -492,11 +502,12 @@ class TableNewProcessor(cm: TableModel) {
       val sortField = cm.sortKeyDims.get.find(field.column equals _)
       if (sortField.isEmpty) {
         val encoders = if (cm.parentTable.isDefined &&
-          cm.dataMapRelation.get.get(field).isDefined) {
+          cm.dataMapRelation.get.get(field).isDefined &&
+          cm.dataMapRelation.get.get(field).get.columnTableRelationList.size==1) {
--- End diff --

add a space before and after `==`

---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158931327

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
@@ -126,19 +126,33 @@ object PreAggregateUtil {
           attr.aggregateFunction,
           parentTableName,
           parentDatabaseName,
-          parentTableId)
+          parentTableId,
+          "column_" + counter)
+        counter = counter + 1
       case attr: AttributeReference =>
+        val columnRelation = getColumnRelation(attr.name,
+          parentTableId,
+          parentTableName,
+          parentDatabaseName,
+          carbonTable)
+        val arrayBuffer = new ArrayBuffer[ColumnTableRelation]()
+        arrayBuffer += columnRelation
         fieldToDataMapFieldMap += getField(attr.name,
           attr.dataType,
-          parentColumnId = carbonTable.getColumnByName(parentTableName, attr.name).getColumnId,
           parentTableName = parentTableName,
-          parentDatabaseName = parentDatabaseName, parentTableId = parentTableId)
+          columnTableRelationList = arrayBuffer.toList)
       case Alias(attr: AttributeReference, _) =>
+        val columnRelation = getColumnRelation(attr.name,
--- End diff --

move `attr.name` to the next line; do the same for all the places in this function

---
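To illustrate the suggestion, a formatting-only sketch of the call shown in the diff (not standalone code; the identifiers belong to `PreAggregateUtil`):

```scala
// Formatting sketch only: the first argument moves to its own line and the
// remaining arguments stay one per line, as suggested for every call here.
val columnRelation = getColumnRelation(
  attr.name,
  parentTableId,
  parentTableName,
  parentDatabaseName,
  carbonTable)
```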
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158931409

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
@@ -147,6 +161,34 @@ object PreAggregateUtil {
     fieldToDataMapFieldMap
   }
+  /**
+   * Below method will be used to get the column relation
+   * with the parent column which will be used during query and data loading
+   * @param parentColumnName
+   *   parent column name
--- End diff --

why not keep it on the same line?

---
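That is, the scaladoc from the diff with the parameter description kept on the `@param` line (presentation sketch only):

```scala
/**
 * Below method will be used to get the column relation
 * with the parent column which will be used during query and data loading
 *
 * @param parentColumnName parent column name
 */
```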
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158931680

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
+
+  override def beforeAll: Unit = {
+    sql("drop table if exists mainTable")
+    sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
+    sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
+    sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
+    sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
+    sql("create datamap agg3 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end) from mainTable group by name")
--- End diff --

I think these `CREATE DATAMAP` statements should be run inside the test functions, not in beforeAll.

---
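A sketch of what that restructuring might look like for one of the tests, using the statements and expected column name from the diff (whether to drop the datamap afterwards is left to the author):

```scala
// Sketch: datamap creation moved out of beforeAll and into the test body,
// so each test owns the datamap it asserts on.
test("test pre agg create table with expression 2") {
  sql(
    "CREATE DATAMAP agg1 ON TABLE mainTable USING 'preaggregate' AS " +
    "SELECT name, SUM(CASE WHEN age = 35 THEN id ELSE 0 END) FROM mainTable GROUP BY name")
  checkExistence(sql("DESCRIBE FORMATTED mainTable_agg1"), true, "maintable_column_0_sum")
}
```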
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1694#discussion_r158932043

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
+
+  override def beforeAll: Unit = {
+    sql("drop table if exists mainTable")
+    sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
+    sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
+    sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
+    sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
+    sql("create datamap agg3 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end) from mainTable group by name")
+    sql("create datamap agg4 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end), sum(case when age=35 then id else 0 end) from mainTable group by name")
+    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/measureinsertintotest.csv' into table mainTable")
+  }
+
+  test("test pre agg create table with expression 1") {
+    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0"), true, "maintable_age_count")
+  }
+
+  test("test pre agg create table with expression 2") {
+    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg1"), true, "maintable_column_0_sum")
+  }
+
+  test("test pre agg create table with expression 3") {
+    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg2"), true, "maintable_column_0_sum")
+  }
+
+  test("test pre agg create table with expression 4") {
+    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg3"), true, "maintable_column_0_sum")
+  }
+
+  test("test pre agg create table with expression 5") {
+    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg4"), true, "maintable_column_0_sum")
+    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg4"), true, "maintable_column_1_sum")
+  }
+
--- End diff --

Can you add some more test cases:
1. a nested CASE WHEN (a CASE WHEN inside another CASE WHEN)
2. a CASE WHEN combined with a filter when creating the datamap, where the filter column is part of the GROUP BY columns

---
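One possible shape for the two requested cases is sketched below. The datamap names (agg5, agg6), the exact expressions, and the filter value are illustrative assumptions, and whether each statement should succeed or be rejected is for the PR author to decide, so assertions are omitted:

```scala
// Hypothetical sketches of the requested tests; expected column names and
// error behaviour depend on the implementation, so no checks are asserted.

// 1. Nested CASE WHEN inside the aggregate expression.
sql(
  "CREATE DATAMAP agg5 ON TABLE mainTable USING 'preaggregate' AS " +
  "SELECT name, SUM(CASE WHEN age = 35 THEN (CASE WHEN id > 0 THEN id ELSE 0 END) ELSE 0 END) " +
  "FROM mainTable GROUP BY name")

// 2. CASE WHEN together with a filter whose column is part of the GROUP BY.
sql(
  "CREATE DATAMAP agg6 ON TABLE mainTable USING 'preaggregate' AS " +
  "SELECT name, SUM(CASE WHEN age = 35 THEN id ELSE 0 END) " +
  "FROM mainTable WHERE name = 'bob' GROUP BY name")
```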