[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] Fixed par...

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] Fixed par...

qiuchenjian-2
GitHub user kunal642 opened a pull request:

    https://github.com/apache/carbondata/pull/1613

    [CARBONDATA-1737] [CARBONDATA-1760] Fixed partial load issue if user has set segments to access on parent table

    Analysis: Partial load was happening on pre-aggregate table when the user has set segments to access for the parent table.
   
    Solution: Set segments to access to * before firing load for child table.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kunal642/carbondata preagg_loading_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1613.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1613
   
----
commit ac5d4fa6bf843810422f04c9cda90dc3ac3224c2
Author: kunal642 <[hidden email]>
Date:   2017-12-05T11:31:46Z

    fixed partial load issue if user has set segments to access on parent table

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/456/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1724/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1613#discussion_r154977110
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/CreatePreAggregateTableCommand.scala ---
    @@ -124,19 +122,15 @@ case class CreatePreAggregateTableCommand(
         val loadAvailable = SegmentStatusManager.readLoadMetadata(parentCarbonTable.getMetaDataFilepath)
           .nonEmpty
         if (loadAvailable) {
    -      val headers = parentCarbonTable.getTableInfo.getFactTable.getListOfColumns.
    -        asScala.map(_.getColumnName).mkString(",")
    -      val childDataFrame = sparkSession.sql(
    -        new CarbonSpark2SqlParser().addPreAggLoadFunction(queryString))
    -      CarbonLoadDataCommand(tableIdentifier.database,
    -        tableIdentifier.table,
    -        null,
    -        Nil,
    -        Map("fileheader" -> headers),
    -        isOverwriteTable = false,
    -        dataFrame = Some(childDataFrame),
    -        internalOptions = Map(CarbonCommonConstants.IS_INTERNAL_LOAD_CALL -> "true")).
    -        run(sparkSession)
    +      // Passing segmentToLoad as * because we want to load all the segments into the
    +      // pre-aggregate table even if the user has set some segments on the parent table.
    +      PreAggregateUtil
    +        .startDataLoadForDataMap(parentCarbonTable,
    --- End diff --
   
    follow coding style:
    ```
        a.foo(
           paramA,
           paramB)
    ```


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1613#discussion_r154977309
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +495,49 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * This method will start load process on the data map
    +   */
    +  def startDataLoadForDataMap(parentCarbonTable: CarbonTable,
    --- End diff --
   
    move first parameter to next line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1613#discussion_r154977627
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +495,49 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * This method will start load process on the data map
    +   */
    +  def startDataLoadForDataMap(parentCarbonTable: CarbonTable,
    +      dataMapIdentifier: TableIdentifier,
    +      queryString: String,
    +      segmentToLoad: String,
    +      validateSegments: Boolean,
    +      sparkSession: SparkSession): Unit = {
    +    CarbonSession.threadSet(
    --- End diff --
   
    Do not use thread local to pass parameters, pass them explicitly to the command


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2106/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/487/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kunal642 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1613#discussion_r155155362
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +495,49 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * This method will start load process on the data map
    +   */
    +  def startDataLoadForDataMap(parentCarbonTable: CarbonTable,
    +      dataMapIdentifier: TableIdentifier,
    +      queryString: String,
    +      segmentToLoad: String,
    +      validateSegments: Boolean,
    +      sparkSession: SparkSession): Unit = {
    +    CarbonSession.threadSet(
    --- End diff --
   
    This parameter is used to specify the segments to scan in CarbonScanRDD. Therefore i cannot pass it explicitly. As discussed with @gvramana and @ravipesala this was only option that was agreed upon.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1751/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2134/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/497/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1760/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/509/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1613#discussion_r155249189
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala ---
    @@ -359,14 +359,24 @@ class CarbonScanRDD(
         }
         val carbonSessionInfo = ThreadLocalSessionInfo.getCarbonSessionInfo
         if (carbonSessionInfo != null) {
    -      CarbonTableInputFormat.setAggeragateTableSegments(conf, carbonSessionInfo.getSessionParams
    -        .getProperty(CarbonCommonConstants.CARBON_INPUT_SEGMENTS +
    -                     identifier.getCarbonTableIdentifier.getDatabaseName + "." +
    -                     identifier.getCarbonTableIdentifier.getTableName, ""))
    -      CarbonTableInputFormat.setValidateSegmentsToAccess(conf, carbonSessionInfo.getSessionParams
    +      if (carbonSessionInfo.getSessionParams
    +            .getProperty(CarbonCommonConstants.CARBON_INPUT_SEGMENTS +
    +                         identifier.getCarbonTableIdentifier.getDatabaseName + "." +
    +                         identifier.getCarbonTableIdentifier.getTableName) != null) {
    +        CarbonTableInputFormat.setAggeragateTableSegments(conf, carbonSessionInfo.getSessionParams
    +          .getProperty(CarbonCommonConstants.CARBON_INPUT_SEGMENTS +
    +                       identifier.getCarbonTableIdentifier.getDatabaseName + "." +
    +                       identifier.getCarbonTableIdentifier.getTableName))
    +      }
    +      if (carbonSessionInfo.getSessionParams
    +            .getProperty(CarbonCommonConstants.VALIDATE_CARBON_INPUT_SEGMENTS +
    +                         identifier.getCarbonTableIdentifier.getDatabaseName + "." +
    +                         identifier.getCarbonTableIdentifier.getTableName) != null) {
    --- End diff --
   
    Please take this to variable and reuse it


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/512/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2143/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    LGTM


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1613: [CARBONDATA-1737] [CARBONDATA-1760] [PreAgg] Fixed p...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1613
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/516/



---
12