[GitHub] carbondata pull request #1605: [WIP] added support to compact segments in pr...

classic Classic list List threaded Threaded
85 messages Options
12345
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [WIP] added support to compact segments in pr...

qiuchenjian-2
GitHub user kunal642 opened a pull request:

    https://github.com/apache/carbondata/pull/1605

    [WIP] added support to compact segments in pre-agg table

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kunal642/carbondata preagg_compaction_support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1605.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1605
   
----
commit cc58e9d2d26a99465b2daa3af1a24777581718cf
Author: kunal642 <[hidden email]>
Date:   2017-11-22T14:03:37Z

    added support to compact segments in pre-agg table

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [WIP] added support to compact segments in pre-agg t...

qiuchenjian-2
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    @jackylk Can you please start the first level review.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [WIP] added support to compact segments in pre-agg t...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2051/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [WIP] added support to compact segments in pre-agg t...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1661/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [WIP] added support to compact segments in pre-agg t...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2054/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [WIP] added support to compact segments in pre-agg t...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1667/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [CARBONDATA-1526] [PreAgg] Added support to compact ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/417/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [CARBONDATA-1526] [PreAgg] Added support to compact ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2075/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [CARBONDATA-1526] [PreAgg] Added support to compact ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1689/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [CARBONDATA-1526] [PreAgg] Added support to compact ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    @kunal642 Please add the description of the work done in this PR.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1605: [CARBONDATA-1526] [PreAgg] Added support to compact ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/1605
 
    @ravipesala description added


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154962499
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/events/AlterTableEvents.scala ---
    @@ -133,25 +133,25 @@ case class AlterTableRenameAbortEvent(carbonTable: CarbonTable,
     /**
      *
      * @param carbonTable
    - * @param carbonLoadModel
    + * @param carbonMergerMapping
    --- End diff --
   
    remove all these parameter comment if there is no description


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154962772
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/events/AlterTableEvents.scala ---
    @@ -133,25 +133,25 @@ case class AlterTableRenameAbortEvent(carbonTable: CarbonTable,
     /**
      *
      * @param carbonTable
    - * @param carbonLoadModel
    + * @param carbonMergerMapping
      * @param mergedLoadName
      * @param sQLContext
      */
    -case class AlterTableCompactionPreEvent(sparkSession: SparkSession, carbonTable: CarbonTable,
    -    carbonLoadModel: CarbonLoadModel,
    +case class AlterTableCompactionPreEvent(carbonTable: CarbonTable,
    --- End diff --
   
    move `carbonTable` to next line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154962941
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/events/AlterTableEvents.scala ---
    @@ -133,25 +133,25 @@ case class AlterTableRenameAbortEvent(carbonTable: CarbonTable,
     /**
      *
      * @param carbonTable
    - * @param carbonLoadModel
    + * @param carbonMergerMapping
      * @param mergedLoadName
      * @param sQLContext
      */
    -case class AlterTableCompactionPreEvent(sparkSession: SparkSession, carbonTable: CarbonTable,
    -    carbonLoadModel: CarbonLoadModel,
    +case class AlterTableCompactionPreEvent(carbonTable: CarbonTable,
    +    carbonMergerMapping: CarbonMergerMapping,
         mergedLoadName: String,
         sQLContext: SQLContext) extends Event with AlterTableCompactionEventInfo
     
     
     /**
      *
      * @param carbonTable
    - * @param carbonLoadModel
    + * @param carbonMergerMapping
    --- End diff --
   
    Add description of all these parameters, same for all related events


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154963700
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/events/LoadEvents.scala ---
    @@ -42,6 +42,14 @@ case class LoadTablePostExecutionEvent(sparkSession: SparkSession,
         carbonTableIdentifier: CarbonTableIdentifier,
         carbonLoadModel: CarbonLoadModel) extends Event with LoadEventInfo
     
    +/**
    + * Class for handling operations after data load completion and before final commit of load
    + * operation. Example usage: For loading pre-aggregate tables
    + */
    +case class LoadTablePreStatusUpdateEvent(sparkSession: SparkSession,
    --- End diff --
   
    The usage of this is not clear, why is it needed


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154964861
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/Compactor.scala ---
    @@ -69,11 +69,7 @@ object Compactor {
         // trigger event for compaction
         val operationContext = new OperationContext
         val alterTableCompactionPreEvent: AlterTableCompactionPreEvent =
    -      AlterTableCompactionPreEvent(compactionCallableModel.sqlContext.sparkSession,
    -        carbonTable,
    -        carbonLoadModel,
    -        mergedLoadName,
    -        sc)
    +      AlterTableCompactionPreEvent(carbonTable, carbonMergerMapping, mergedLoadName, sc)
    --- End diff --
   
    please modify line 55, it is not storePath, it is tablePath


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154965194
 
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -492,6 +492,10 @@ object CarbonDataRDDFactory {
             throw new Exception("No Data to load")
           }
           writeDictionary(carbonLoadModel, result, writeAll = false)
    +      val loadTablePreStatusUpdateEvent = LoadTablePreStatusUpdateEvent(sqlContext.sparkSession,
    --- End diff --
   
    move parameter to next line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154965672
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala ---
    @@ -159,7 +161,14 @@ case class CarbonAlterTableCompactionCommand(
         // if system level compaction is enabled then only one compaction can run in the system
         // if any other request comes at this time then it will create a compaction request file.
         // so that this will be taken up by the compaction process which is executing.
    -    if (!isConcurrentCompactionAllowed) {
    +    if (carbonTable.isChildDataMap) {
    --- End diff --
   
    please modify the comment start from line 161, put them in correct place and add comment for this newly added if condition


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154966099
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala ---
    @@ -159,7 +161,14 @@ case class CarbonAlterTableCompactionCommand(
         // if system level compaction is enabled then only one compaction can run in the system
         // if any other request comes at this time then it will create a compaction request file.
         // so that this will be taken up by the compaction process which is executing.
    -    if (!isConcurrentCompactionAllowed) {
    +    if (carbonTable.isChildDataMap) {
    +      carbonLoadModel.setCompactionType(alterTableModel.compactionType.toUpperCase match {
    --- End diff --
   
    change `setCompactionType` to accept enum instead of String


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1605: [CARBONDATA-1526] [PreAgg] Added support to c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1605#discussion_r154968981
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala ---
    @@ -212,4 +221,56 @@ case class CarbonAlterTableCompactionCommand(
           }
         }
       }
    +
    +  private def startCompactionForDataMap(carbonLoadModel: CarbonLoadModel,
    +      sparkSession: SparkSession): Unit = {
    +    val carbonTable = carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable
    +    val loadMetaDataDetails = CarbonDataMergerUtil
    +      .identifySegmentsToBeMerged(carbonLoadModel,
    +        CarbonDataMergerUtil.getCompactionSize(CompactionType.MAJOR),
    +        carbonLoadModel.getLoadMetadataDetails,
    +        carbonLoadModel.getCompactionType)
    +    val segments = loadMetaDataDetails.asScala.map(_.getLoadName)
    +    if (segments.nonEmpty) {
    +      CarbonSession
    +        .threadSet(CarbonCommonConstants.CARBON_INPUT_SEGMENTS +
    --- End diff --
   
    Do not pass parameters in this way, pass the parameter to the function you need


---
12345