GitHub user Xaprice opened a pull request:
https://github.com/apache/carbondata/pull/2620

[CARBONDATA-2839] Add custom compaction example

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

 - [ x ] Any interfaces changed?
   no
 - [ x ] Any backward compatibility impacted?
   no
 - [ x ] Document update required?
   no
 - [ x ] Testing done
   Please provide details on
   - Whether new unit test cases have been added or why no new tests are required?
   - How it is tested? Please attach test report.
   - Is it a performance related change? Please attach the performance test report.
   - Any additional information to help reviewers in testing this change.
 - [ x ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   small change

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Xaprice/carbondata custom_compaction_example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2620.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2620

----
commit 0402a5f1f66027e5b7d72b514eb80b09c2d7222e
Author: Jin Zhou <xaprice@...>
Date:   2018-08-08T09:01:43Z

    [CARBONDATA-2839] Add custom compaction example

----

--- |
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2620#discussion_r208523007

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
@@ -0,0 +1,69 @@
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.examples.util.ExampleUtils
+
+
+object CustomCompactionExample {
+
+  def main(args: Array[String]): Unit = {
+    val spark = ExampleUtils.createCarbonSession("CustomCompactionExample")
+    exampleBody(spark)
+    spark.close()
+  }
+
+  def exampleBody(spark : SparkSession): Unit = {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
+
+    spark.sql("DROP TABLE IF EXISTS custom_compaction_table")
+
+    spark.sql(
+      s"""
+         | CREATE TABLE IF NOT EXISTS custom_compaction_table(
+         | ID Int,
+         | date Date,
+         | country String,
+         | name String,
+         | phonetype String,
+         | serialname String,
+         | salary Int,
+         | floatField float
+         | ) STORED BY 'carbondata'
+       """.stripMargin)
+
+    val rootPath = new File(this.getClass.getResource("/").getPath
+      + "../../../..").getCanonicalPath
+    val path = s"$rootPath/examples/spark2/src/main/resources/dataSample.csv"
+
+    // load 4 segments
+    // scalastyle:off
+    (1 to 4).foreach(_ => spark.sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE custom_compaction_table
+         | OPTIONS('HEADER'='true')
+       """.stripMargin))
+    // scalastyle:on
+
+    // show all segments: 0,1,2,3
+    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
+
+    // do custom compaction, segments specified will be merged
+    spark.sql("ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)")
+    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
+
+    CarbonProperties.getInstance().addProperty(
--- End diff --

Why set this property here?

--- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2620 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6210/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2620 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7836/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2620 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6561/ --- |
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/2620 retest this please --- |
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2620#discussion_r208780123

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
@@ -0,0 +1,69 @@
+package org.apache.carbondata.examples
--- End diff --

Please add the Apache license header.

--- |
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2620#discussion_r208780252

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
@@ -0,0 +1,69 @@
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.examples.util.ExampleUtils
+
+
--- End diff --

Please add a description that explains the example.

--- |
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2620#discussion_r208780697

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
@@ -0,0 +1,69 @@
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.examples.util.ExampleUtils
+
+
+object CustomCompactionExample {
+
+  def main(args: Array[String]): Unit = {
+    val spark = ExampleUtils.createCarbonSession("CustomCompactionExample")
+    exampleBody(spark)
+    spark.close()
+  }
+
+  def exampleBody(spark : SparkSession): Unit = {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
+
+    spark.sql("DROP TABLE IF EXISTS custom_compaction_table")
+
+    spark.sql(
+      s"""
+         | CREATE TABLE IF NOT EXISTS custom_compaction_table(
+         | ID Int,
+         | date Date,
+         | country String,
+         | name String,
+         | phonetype String,
+         | serialname String,
+         | salary Int,
+         | floatField float
+         | ) STORED BY 'carbondata'
+       """.stripMargin)
+
+    val rootPath = new File(this.getClass.getResource("/").getPath
+      + "../../../..").getCanonicalPath
+    val path = s"$rootPath/examples/spark2/src/main/resources/dataSample.csv"
+
+    // load 4 segments
+    // scalastyle:off
+    (1 to 4).foreach(_ => spark.sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE custom_compaction_table
+         | OPTIONS('HEADER'='true')
+       """.stripMargin))
+    // scalastyle:on
+
+    // show all segments: 0,1,2,3
+    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
+
+    // do custom compaction, segments specified will be merged
+    spark.sql("ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)")
+    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
+
+    CarbonProperties.getInstance().addProperty(
+      CarbonCommonConstants.CARBON_DATE_FORMAT,
+      CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT)
+
--- End diff --

After the custom compaction, please query the table once to check that the data is still correct.

--- |
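One way the check requested in this comment could look is a row-count comparison around the compaction, since merging segments should not change how many rows the table returns. The sketch below is only an illustration and is not taken from the pull request: `CompactionCheck`, `rowCountUnchanged` and the `runCount` parameter are hypothetical names, and `runCount` stands in for whatever executes a count query (for example `sql => spark.sql(sql).head().getLong(0)` with a `SparkSession` named `spark`).

```scala
object CompactionCheck {
  // Hypothetical helper, not part of CarbonData or of this pull request.
  // runCount is an assumption: any function that runs a SQL count query and
  // returns the count, e.g. sql => spark.sql(sql).head().getLong(0).
  def rowCountUnchanged(table: String, runCount: String => Long)(compact: => Unit): Boolean = {
    val countSql = s"SELECT COUNT(*) FROM $table"
    val before = runCount(countSql) // rows visible before compaction
    compact                         // run the compaction step passed by the caller
    val after = runCount(countSql)  // rows visible after compaction
    before == after
  }
}
```

A row count is a weak check; comparing a few aggregates or sample rows before and after would catch more, at the cost of a longer example.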
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2620 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7846/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2620 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6571/ --- |
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2620 What does “CUSTOM COMPACTION” mean here? --- |
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/2620 @xuchuanyin It means that users can compact exactly the segments they specify: `ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)` --- |
Github user Xaprice commented on the issue:
https://github.com/apache/carbondata/pull/2620 @xuchuanyin, “CUSTOM COMPACTION” is a new compaction type in addition to MAJOR and MINOR COMPACTION. With custom compaction, the user directly specifies the IDs of the segments to be merged. --- |
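To make the two answers above concrete, here is a small sketch that builds the statement quoted in this thread from a list of segment IDs. `CustomCompactionSql` and `customCompactionSql` are hypothetical names introduced for illustration, not CarbonData APIs, and the sketch assumes plain integer segment IDs as used in the example under review.

```scala
object CustomCompactionSql {
  // Hypothetical helper, not part of CarbonData: it only assembles the
  // ALTER TABLE ... COMPACT 'CUSTOM' statement quoted in this thread.
  // Assumption: segment IDs are plain integers, as in the example.
  def customCompactionSql(table: String, segmentIds: Seq[Int]): String = {
    require(segmentIds.nonEmpty, "at least one segment id is required")
    s"ALTER TABLE $table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (${segmentIds.mkString(",")})"
  }
}
```

With a `SparkSession` named `spark`, the resulting string could be passed to `spark.sql(...)`, as the example in the diff does with a literal statement.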
Github user Xaprice commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2620#discussion_r208840858

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CustomCompactionExample.scala ---
@@ -0,0 +1,69 @@
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.examples.util.ExampleUtils
+
+
+object CustomCompactionExample {
+
+  def main(args: Array[String]): Unit = {
+    val spark = ExampleUtils.createCarbonSession("CustomCompactionExample")
+    exampleBody(spark)
+    spark.close()
+  }
+
+  def exampleBody(spark : SparkSession): Unit = {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
+
+    spark.sql("DROP TABLE IF EXISTS custom_compaction_table")
+
+    spark.sql(
+      s"""
+         | CREATE TABLE IF NOT EXISTS custom_compaction_table(
+         | ID Int,
+         | date Date,
+         | country String,
+         | name String,
+         | phonetype String,
+         | serialname String,
+         | salary Int,
+         | floatField float
+         | ) STORED BY 'carbondata'
+       """.stripMargin)
+
+    val rootPath = new File(this.getClass.getResource("/").getPath
+      + "../../../..").getCanonicalPath
+    val path = s"$rootPath/examples/spark2/src/main/resources/dataSample.csv"
+
+    // load 4 segments
+    // scalastyle:off
+    (1 to 4).foreach(_ => spark.sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE custom_compaction_table
+         | OPTIONS('HEADER'='true')
+       """.stripMargin))
+    // scalastyle:on
+
+    // show all segments: 0,1,2,3
+    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
+
+    // do custom compaction, segments specified will be merged
+    spark.sql("ALTER TABLE custom_compaction_table COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1,2)")
+    spark.sql("SHOW SEGMENTS FOR TABLE custom_compaction_table").show()
+
+    CarbonProperties.getInstance().addProperty(
--- End diff --

This property is set to a non-default value at the beginning of the method 'exampleBody'. To keep this test case complete, the property is set back to its default value at the end, though it may seem redundant.

--- |
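The set-then-restore pattern described in this reply can also be written so that the restore happens even if the body throws. The sketch below illustrates that idea with `java.util.Properties` as a stand-in; it does not use `CarbonProperties`, and `PropertyRestore`, `withProperty` and the key in the usage note are hypothetical names chosen for the illustration.

```scala
import java.util.Properties

object PropertyRestore {
  // Hypothetical helper using java.util.Properties as a stand-in for a
  // process-wide property store; it is not a CarbonData API.
  def withProperty[T](props: Properties, key: String, value: String)(body: => T): T = {
    val previous = Option(props.getProperty(key)) // remember the old value, if any
    props.setProperty(key, value)
    try {
      body
    } finally {
      // Put the store back the way it was, even when body throws.
      previous match {
        case Some(old) => props.setProperty(key, old)
        case None => props.remove(key)
      }
    }
  }
}
```

For example, `PropertyRestore.withProperty(props, "example.date.format", "yyyy/MM/dd") { ... }` would run the block with the overridden value and then restore the earlier one; the key name here is a placeholder.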
Github user Xaprice commented on the issue:
https://github.com/apache/carbondata/pull/2620 @chenliang613, please take a look. --- |
Github user Xaprice commented on the issue:
https://github.com/apache/carbondata/pull/2620 retest this please --- |
Github user Xaprice commented on the issue:
https://github.com/apache/carbondata/pull/2620 retest this please --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2620 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6242/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2620 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7880/ --- |