[GitHub] carbondata pull request #2620: [CARBONDATA-2839] Add custom compaction examp...

qiuchenjian-2
GitHub user Xaprice opened a pull request:

    https://github.com/apache/carbondata/pull/2620

    [CARBONDATA-2839] Add custom compaction example

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [x] Any interfaces changed?
           No
     - [x] Any backward compatibility impacted?
           No
     - [x] Document update required?
           No
     - [x] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.

     - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
           Small change


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Xaprice/carbondata custom_compaction_example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2620.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2620
   
----
commit 0402a5f1f66027e5b7d72b514eb80b09c2d7222e
Author: Jin Zhou <xaprice@...>
Date:   2018-08-08T09:01:43Z

    [CARBONDATA-2839] Add custom compaction example

----


---

[GitHub] carbondata pull request #2620: [CARBONDATA-2839] Add custom compaction examp...

qiuchenjian-2
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2620#discussion_r208523007
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
    @@ -0,0 +1,69 @@
    +package org.apache.carbondata.examples
    +
    +import java.io.File
    +
    +import org.apache.spark.sql.SparkSession
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.carbondata.examples.util.ExampleUtils
    +
    +
    +object CustomCompactionExample {
    +
    +  def main(args: Array[String]): Unit = {
    +    val spark = ExampleUtils.createCarbonSession("CustomCompactionExample")
    +    exampleBody(spark)
    +    spark.close()
    +  }
    +
    +  def exampleBody(spark : SparkSession): Unit = {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
    +
    +    spark.sql("DROP TABLE IF EXISTS custom_compaction_table")
    +
    +    spark.sql(
    +      s"""
    +         | CREATE TABLE IF NOT EXISTS custom_compaction_table(
    +         | ID Int,
    +         | date Date,
    +         | country String,
    +         | name String,
    +         | phonetype String,
    +         | serialname String,
    +         | salary Int,
    +         | floatField float
    +         | ) STORED BY 'carbondata'
    +       """.stripMargin)
    +
    +    val rootPath = new File(this.getClass.getResource("/").getPath
    +      + "../../../..").getCanonicalPath
    +    val path = s"$rootPath/examples/spark2/src/main/resources/dataSample.csv"
    +
    +    // load 4 segments
    +    // scalastyle:off
    +    (1 to 4).foreach(_ => spark.sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$path'
    +         | INTO TABLE custom_compaction_table
    +         | OPTIONS('HEADER'='true')
    +       """.stripMargin))
    +    // scalastyle:on
    +
    +    // show all segments: 0,1,2,3
    +    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
    +
    +    // do custom compaction, segments specified will be merged
    +    spark.sql("ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)")
    +    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
    +
    +    CarbonProperties.getInstance().addProperty(
    --- End diff --
   
    Why is this property set here?


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    SDV Build Success, please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6210/



---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    Build Failed with Spark 2.1.0, please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7836/



---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6561/



---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    retest this please


---

[GitHub] carbondata pull request #2620: [CARBONDATA-2839] Add custom compaction examp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2620#discussion_r208780123
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
    @@ -0,0 +1,69 @@
    +package org.apache.carbondata.examples
    --- End diff --
   
    Please add the Apache license header.


---

[GitHub] carbondata pull request #2620: [CARBONDATA-2839] Add custom compaction examp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2620#discussion_r208780252
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
    @@ -0,0 +1,69 @@
    +package org.apache.carbondata.examples
    +
    +import java.io.File
    +
    +import org.apache.spark.sql.SparkSession
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.carbondata.examples.util.ExampleUtils
    +
    +
    --- End diff --
   
    Please add a description explaining what the example does.


---

[GitHub] carbondata pull request #2620: [CARBONDATA-2839] Add custom compaction examp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2620#discussion_r208780697
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
    @@ -0,0 +1,69 @@
    +package org.apache.carbondata.examples
    +
    +import java.io.File
    +
    +import org.apache.spark.sql.SparkSession
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.carbondata.examples.util.ExampleUtils
    +
    +
    +object CustomCompactionExample {
    +
    +  def main(args: Array[String]): Unit = {
    +    val spark = ExampleUtils.createCarbonSession("CustomCompactionExample")
    +    exampleBody(spark)
    +    spark.close()
    +  }
    +
    +  def exampleBody(spark : SparkSession): Unit = {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
    +
    +    spark.sql("DROP TABLE IF EXISTS custom_compaction_table")
    +
    +    spark.sql(
    +      s"""
    +         | CREATE TABLE IF NOT EXISTS custom_compaction_table(
    +         | ID Int,
    +         | date Date,
    +         | country String,
    +         | name String,
    +         | phonetype String,
    +         | serialname String,
    +         | salary Int,
    +         | floatField float
    +         | ) STORED BY 'carbondata'
    +       """.stripMargin)
    +
    +    val rootPath = new File(this.getClass.getResource("/").getPath
    +      + "../../../..").getCanonicalPath
    +    val path = s"$rootPath/examples/spark2/src/main/resources/dataSample.csv"
    +
    +    // load 4 segments
    +    // scalastyle:off
    +    (1 to 4).foreach(_ => spark.sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$path'
    +         | INTO TABLE custom_compaction_table
    +         | OPTIONS('HEADER'='true')
    +       """.stripMargin))
    +    // scalastyle:on
    +
    +    // show all segments: 0,1,2,3
    +    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
    +
    +    // do custom compaction, segments specified will be merged
    +    spark.sql("ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)")
    +    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
    +
    +    CarbonProperties.getInstance().addProperty(
    +      CarbonCommonConstants.CARBON_DATE_FORMAT,
    +      CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT)
    +
    --- End diff --
   
    After the custom compaction, please query the table data once to check that it is still correct.


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    Build Failed with Spark 2.1.0, please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7846/



---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6571/



---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    What does ‘CUSTOM COMPACTION’ mean here?


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    @xuchuanyin It means that users can specify exactly which segments to compact:
    `ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)`


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Xaprice commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    @xuchuanyin, ‘CUSTOM COMPACTION’ is a new compaction type, in addition to MAJOR and MINOR compaction. With custom compaction, the user directly specifies the IDs of the segments to be merged.
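
For illustration, the statement form quoted in this thread can be assembled from a table name and a list of segment IDs. This is only a minimal sketch: `CustomCompactionSql` and `build` are hypothetical names introduced here, not part of CarbonData, and the sketch assumes integer segment IDs as in the example under review. It only builds the SQL text; running it still requires passing the string to `spark.sql`.

```scala
object CustomCompactionSql {

  /** Hypothetical helper (not a CarbonData API): builds the custom compaction statement. */
  def build(table: String, segmentIds: Seq[Int]): String = {
    // Custom compaction needs an explicit, non-empty list of segments to merge.
    require(segmentIds.nonEmpty, "at least one segment id is required")
    val ids = segmentIds.mkString(",")
    s"ALTER TABLE $table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN ($ids)"
  }

  def main(args: Array[String]): Unit = {
    // Same statement as in the example: merge segments 1 and 2.
    println(build("custom_compaction_table", Seq(1, 2)))
  }
}
```

With the inputs above, `build` returns the same statement used in the example: `ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)`.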


---

[GitHub] carbondata pull request #2620: [CARBONDATA-2839] Add custom compaction examp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Xaprice commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2620#discussion_r208840858
 
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
    @@ -0,0 +1,69 @@
    +package org.apache.carbondata.examples
    +
    +import java.io.File
    +
    +import org.apache.spark.sql.SparkSession
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.carbondata.examples.util.ExampleUtils
    +
    +
    +object CustomCompactionExample {
    +
    +  def main(args: Array[String]): Unit = {
    +    val spark = ExampleUtils.createCarbonSession("CustomCompactionExample")
    +    exampleBody(spark)
    +    spark.close()
    +  }
    +
    +  def exampleBody(spark : SparkSession): Unit = {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
    +
    +    spark.sql("DROP TABLE IF EXISTS custom_compaction_table")
    +
    +    spark.sql(
    +      s"""
    +         | CREATE TABLE IF NOT EXISTS custom_compaction_table(
    +         | ID Int,
    +         | date Date,
    +         | country String,
    +         | name String,
    +         | phonetype String,
    +         | serialname String,
    +         | salary Int,
    +         | floatField float
    +         | ) STORED BY 'carbondata'
    +       """.stripMargin)
    +
    +    val rootPath = new File(this.getClass.getResource("/").getPath
    +      + "../../../..").getCanonicalPath
    +    val path = s"$rootPath/examples/spark2/src/main/resources/dataSample.csv"
    +
    +    // load 4 segments
    +    // scalastyle:off
    +    (1 to 4).foreach(_ => spark.sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$path'
    +         | INTO TABLE custom_compaction_table
    +         | OPTIONS('HEADER'='true')
    +       """.stripMargin))
    +    // scalastyle:on
    +
    +    // show all segments: 0,1,2,3
    +    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
    +
    +    // do custom compaction, segments specified will be merged
    +    spark.sql("ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)")
    +    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
    +
    +    CarbonProperties.getInstance().addProperty(
    --- End diff --
   
    This property is set to a non-default value at the beginning of the method 'exampleBody'. To keep this test case complete, the property is set back to its default value at the end, though that may seem redundant.
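
One way to make the set-and-restore pattern described here explicit is a small scoped helper that restores the previous value in a `finally` block, so the reset also happens if the body throws. This is a sketch under stated assumptions: `PropertyScope`, `props` and `withProperty` are hypothetical, the mutable map is a stand-in for a process-wide property store rather than the `CarbonProperties` API, and the key `example.date.format` with its default is made up for the demonstration.

```scala
import scala.collection.mutable

object PropertyScope {

  // Stand-in for a global property store (assumption, not CarbonProperties).
  val props: mutable.Map[String, String] =
    mutable.Map("example.date.format" -> "yyyy-MM-dd")

  /** Sets `key` to `value` while `body` runs, then restores the previous state. */
  def withProperty[A](key: String, value: String)(body: => A): A = {
    val previous = props.get(key)
    props(key) = value
    try {
      body
    } finally {
      previous match {
        case Some(old) => props(key) = old   // put the old value back
        case None      => props.remove(key)  // key did not exist before
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val during = withProperty("example.date.format", "yyyy/MM/dd") {
      props("example.date.format")
    }
    println(during)                       // value seen inside the scope
    println(props("example.date.format")) // value after the scope ends
  }
}
```

The example in the pull request achieves the same effect by calling `addProperty` once at the start and once at the end of `exampleBody`; the helper only guards against the reset being skipped on failure.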


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Xaprice commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    @chenliang613, please take a look.


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Xaprice commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    retest this please


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Xaprice commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    retest this please


---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    SDV Build Success, please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6242/



---

[GitHub] carbondata issue #2620: [CARBONDATA-2839] Add custom compaction example

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2620
 
    Build Failed with Spark 2.1.0, please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7880/



---