[GitHub] carbondata pull request #1694: [WIP]Added code to support case expression

classic Classic list List threaded Threaded
48 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [WIP]Added code to support case expression

qiuchenjian-2
GitHub user kumarvishal09 opened a pull request:

    https://github.com/apache/carbondata/pull/1694

    [WIP]Added code to support case expression

    Case expression support
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata Expression_Support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1694
   
----
commit 91536ee8960c1bedf26de30aaf7cf4a1721c077f
Author: kumarvishal <kumarvishal.1802@...>
Date:   2017-12-20T10:16:02Z

    Added code to support case expression

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2189/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2461/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/967/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2190/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2462/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1112/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [WIP]Added code to support case expression

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2328/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to suppor...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2363/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to suppor...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1147/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to suppor...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1694
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2575/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158930636
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
    @@ -0,0 +1,44 @@
    +package org.apache.carbondata.integration.spark.testsuite.preaggregate
    --- End diff --
   
    add license header


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158930733
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
    @@ -0,0 +1,44 @@
    +package org.apache.carbondata.integration.spark.testsuite.preaggregate
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
    +
    +  override def beforeAll: Unit = {
    +    sql("drop table if exists mainTable")
    +    sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
    +    sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
    +    sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
    +    sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
    --- End diff --
   
    format the SQL string to make it good nice, they should be capital and separate into different lines correctly


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158930929
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ---
    @@ -467,10 +474,13 @@ class TableNewProcessor(cm: TableModel) {
         // Sort columns should be at the begin of all columns
         cm.sortKeyDims.get.foreach { keyDim =>
           val field = cm.dimCols.find(keyDim equals _.column).get
    -      val encoders = if (cm.parentTable.isDefined && cm.dataMapRelation.get.get(field).isDefined) {
    +      val encoders = if (cm.parentTable.isDefined &&
    +                         cm.dataMapRelation.get.get(field).isDefined &&
    +                         cm.dataMapRelation.get.get(field).get.columnTableRelationList.size==1 ) {
    --- End diff --
   
    add space before and after `=`


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158931027
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ---
    @@ -492,11 +502,12 @@ class TableNewProcessor(cm: TableModel) {
           val sortField = cm.sortKeyDims.get.find(field.column equals _)
           if (sortField.isEmpty) {
             val encoders = if (cm.parentTable.isDefined &&
    -                           cm.dataMapRelation.get.get(field).isDefined) {
    +                           cm.dataMapRelation.get.get(field).isDefined &&
    +                           cm.dataMapRelation.get.get(field).get.columnTableRelationList.size==1) {
    --- End diff --
   
    add space before and after =


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158931327
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -126,19 +126,33 @@ object PreAggregateUtil {
               attr.aggregateFunction,
               parentTableName,
               parentDatabaseName,
    -          parentTableId)
    +          parentTableId,
    +          "column_" + counter)
    +        counter = counter + 1
           case attr: AttributeReference =>
    +        val columnRelation = getColumnRelation(attr.name,
    +          parentTableId,
    +          parentTableName,
    +          parentDatabaseName,
    +          carbonTable)
    +        val arrayBuffer = new ArrayBuffer[ColumnTableRelation]()
    +        arrayBuffer += columnRelation
             fieldToDataMapFieldMap += getField(attr.name,
               attr.dataType,
    -          parentColumnId = carbonTable.getColumnByName(parentTableName, attr.name).getColumnId,
               parentTableName = parentTableName,
    -          parentDatabaseName = parentDatabaseName, parentTableId = parentTableId)
    +          columnTableRelationList = arrayBuffer.toList)
           case Alias(attr: AttributeReference, _) =>
    +        val columnRelation = getColumnRelation(attr.name,
    --- End diff --
   
    move `attr.name` to next line, do the same for all place in this function


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158931409
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -147,6 +161,34 @@ object PreAggregateUtil {
         fieldToDataMapFieldMap
       }
     
    +  /**
    +   * Below method will be used to get the column relation
    +   * with the parent column which will be used during query and data loading
    +   * @param parentColumnName
    +   *                         parent column name
    --- End diff --
   
    why not keep in same line?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158931680
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
    @@ -0,0 +1,44 @@
    +package org.apache.carbondata.integration.spark.testsuite.preaggregate
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
    +
    +  override def beforeAll: Unit = {
    +    sql("drop table if exists mainTable")
    +    sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
    +    sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
    +    sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
    +    sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
    +    sql("create datamap agg3 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end) from mainTable group by name")
    --- End diff --
   
    I think these `CREATE DATAMAP` statement should be run inside the test function, not in beforeAll


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1694: [CARBONDATA-1925][Pre-Aggregate]Added code to...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1694#discussion_r158932043
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateExpressions.scala ---
    @@ -0,0 +1,44 @@
    +package org.apache.carbondata.integration.spark.testsuite.preaggregate
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +class TestPreAggregateExpressions extends QueryTest with BeforeAndAfterAll {
    +
    +  override def beforeAll: Unit = {
    +    sql("drop table if exists mainTable")
    +    sql("CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 'org.apache.carbondata.format'")
    +    sql("create datamap agg0 on table mainTable using 'preaggregate' as select name,count(age) from mainTable group by name")
    +    sql("create datamap agg1 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end) from mainTable group by name")
    +    sql("create datamap agg2 on table mainTable using 'preaggregate' as select name,sum(case when age=35 then id else 0 end),city from mainTable group by name,city")
    +    sql("create datamap agg3 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end) from mainTable group by name")
    +    sql("create datamap agg4 on table mainTable using 'preaggregate' as select name,sum(case when age=27 then id else 0 end), sum(case when age=35 then id else 0 end) from mainTable group by name")
    +    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/measureinsertintotest.csv' into table mainTable")
    +  }
    +
    +  test("test pre agg create table with expression 1") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0"), true, "maintable_age_count")
    +  }
    +
    +  test("test pre agg create table with expression 2") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg1"), true, "maintable_column_0_sum")
    +  }
    +
    +  test("test pre agg create table with expression 3") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg2"), true, "maintable_column_0_sum")
    +  }
    +
    +  test("test pre agg create table with expression 4") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg3"), true, "maintable_column_0_sum")
    +  }
    +
    +  test("test pre agg create table with expression 5") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg4"), true, "maintable_column_0_sum")
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg4"), true, "maintable_column_1_sum")
    +  }
    +
    --- End diff --
   
    Can you add some more test case:
    1. nested case when (case when inside case when)
    2. case when and with filter when creating datamap. Filter column is part of the group by column


---
123