Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] carbondata pull request #1694: [WIP]Added code to support case expression

Classic

List

48 messages Options

Options

123

[GitHub] carbondata pull request #1694: [WIP]Added code to support case expression

GitHub user kumarvishal09 opened a pull request:

https://github.com/apache/carbondata/pull/1694

[WIP]Added code to support case expression

Case expression support

- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?

- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.

- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kumarvishal09/incubator-carbondata Expression_Support

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1694

----
commit 91536ee8960c1bedf26de30aaf7cf4a1721c077f
Author: kumarvishal <kumarvishal.1802@...>
Date: 2017-12-20T10:16:02Z

Added code to support case expression

----

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2189/

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1694

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2461/

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/967/

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2190/

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1694

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2462/

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/1694

retest this please

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1112/

---

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2328/

---

[GitHub] carbondata issue #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to suppor...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2363/

---

[GitHub] carbondata issue #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to suppor...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1694

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1147/

---

[GitHub] carbondata issue #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to suppor...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1694

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2575/

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158930636

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
--- End diff --

add license header

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158930733

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
+
+ override def beforeAll: Unit = {
+ sql("drop table if exists mainTable")
+ sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
+ sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
+ sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
+ sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
--- End diff --

format the SQL string to make it good nice, they should be capital and separate into different lines correctly

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158930929

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ---
@@ -467,10 +474,13 @@ class TableNewProcessor(cm: TableModel) {
// Sort columns should be at the begin of all columns
cm.sortKeyDims.get.foreach { keyDim =>
val field = cm.dimCols.find(keyDim equals _.column).get
- val encoders = if (cm.parentTable.isDefined && cm.dataMapRelation.get.get(field).isDefined) {
+ val encoders = if (cm.parentTable.isDefined &&
+ cm.dataMapRelation.get.get(field).isDefined &&
+ cm.dataMapRelation.get.get(field).get.columnTableRelationList.size==1 ) {
--- End diff --

add space before and after `=`

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158931027

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ---
@@ -492,11 +502,12 @@ class TableNewProcessor(cm: TableModel) {
val sortField = cm.sortKeyDims.get.find(field.column equals _)
if (sortField.isEmpty) {
val encoders = if (cm.parentTable.isDefined &&
- cm.dataMapRelation.get.get(field).isDefined) {
+ cm.dataMapRelation.get.get(field).isDefined &&
+ cm.dataMapRelation.get.get(field).get.columnTableRelationList.size==1) {
--- End diff --

add space before and after =

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158931327

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
@@ -126,19 +126,33 @@ object PreAggregateUtil {
attr.aggregateFunction,
parentTableName,
parentDatabaseName,
- parentTableId)
+ parentTableId,
+ "column_" + counter)
+ counter = counter + 1
case attr: AttributeReference =>
+ val columnRelation = getColumnRelation(attr.name,
+ parentTableId,
+ parentTableName,
+ parentDatabaseName,
+ carbonTable)
+ val arrayBuffer = new ArrayBuffer[ColumnTableRelation]()
+ arrayBuffer += columnRelation
fieldToDataMapFieldMap += getField(attr.name,
attr.dataType,
- parentColumnId = carbonTable.getColumnByName(parentTableName, attr.name).getColumnId,
parentTableName = parentTableName,
- parentDatabaseName = parentDatabaseName, parentTableId = parentTableId)
+ columnTableRelationList = arrayBuffer.toList)
case Alias(attr: AttributeReference, _) =>
+ val columnRelation = getColumnRelation(attr.name,
--- End diff --

move `attr.name` to next line, do the same for all place in this function

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158931409

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
@@ -147,6 +161,34 @@ object PreAggregateUtil {
fieldToDataMapFieldMap
}

+ /**
+ * Below method will be used to get the column relation
+ * with the parent column which will be used during query and data loading
+ * @param parentColumnName
+ * parent column name
--- End diff --

why not keep in same line?

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158931680

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
+
+ override def beforeAll: Unit = {
+ sql("drop table if exists mainTable")
+ sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
+ sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
+ sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
+ sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
+ sql("create datamap agg3 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end) from mainTable group by name")
--- End diff --

I think these `CREATE DATAMAP` statement should be run inside the test function, not in beforeAll

---

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1694#discussion_r158932043

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
@@ -0,0 +1,44 @@
+package org.apache.carbondata.integration.spark.testsuite.preaggregate
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
+
+ override def beforeAll: Unit = {
+ sql("drop table if exists mainTable")
+ sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
+ sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
+ sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
+ sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
+ sql("create datamap agg3 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end) from mainTable group by name")
+ sql("create datamap agg4 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end), sum(case when age=35 then id else 0 end) from mainTable group by name")
+ sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/measureinsertintotest.csv' into table mainTable")
+ }
+
+ test("test pre agg create table with expression 1") {
+ checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0"), true, "maintable_age_count")
+ }
+
+ test("test pre agg create table with expression 2") {
+ checkExistence(sql("DESCRIBE FORMATTED mainTable_agg1"), true, "maintable_column_0_sum")
+ }
+
+ test("test pre agg create table with expression 3") {
+ checkExistence(sql("DESCRIBE FORMATTED mainTable_agg2"), true, "maintable_column_0_sum")
+ }
+
+ test("test pre agg create table with expression 4") {
+ checkExistence(sql("DESCRIBE FORMATTED mainTable_agg3"), true, "maintable_column_0_sum")
+ }
+
+ test("test pre agg create table with expression 5") {
+ checkExistence(sql("DESCRIBE FORMATTED mainTable_agg4"), true, "maintable_column_0_sum")
+ checkExistence(sql("DESCRIBE FORMATTED mainTable_agg4"), true, "maintable_column_1_sum")
+ }
+
--- End diff --

Can you add some more test case:
1. nested case when (case when inside case when)
2. case when and with filter when creating datamap. Filter column is part of the group by column

---

123