[GitHub] carbondata pull request #1677: [CARBONDATA-1860][PARTITION] Support insertov...

qiuchenjian-2
GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/1677

    [CARBONDATA-1860][PARTITION] Support insertoverwrite for a specific partition.

    Users should be able to overwrite the data of a specific partition. For example:
    INSERT OVERWRITE TABLE partitioned_user
          PARTITION (country = 'US')
          SELECT * FROM another_user au
          WHERE au.country = 'US';
    In the above example, only the data of the partition (country = 'US') is overwritten; the data of all other partitions stays intact.
    To overwrite a specific partition, Carbon should first load the data into a new segment and then drop that partition from all other segments using the partition.map file.
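
The two steps described above (load into a new segment, then drop the partition from the older segments) can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `Segment`, `overwritePartition` and `rowsOf` are hypothetical names, not CarbonData classes, and the sketch removes the partition's rows directly instead of recording the drop in a partition.map file as the PR describes.

```scala
// Toy model only: none of these names come from CarbonData.
object PartitionOverwriteSketch {
  // A partition spec such as Map("country" -> "US").
  type PartitionSpec = Map[String, String]

  // A segment holds rows grouped by the partition they belong to.
  final case class Segment(id: Int, data: Map[PartitionSpec, Seq[String]])

  // Overwrite one partition: write the new rows into a fresh segment,
  // then drop that partition from every older segment.
  // Rows of all other partitions are left untouched.
  def overwritePartition(
      segments: Seq[Segment],
      target: PartitionSpec,
      newRows: Seq[String]): Seq[Segment] = {
    val nextId = if (segments.isEmpty) 0 else segments.map(_.id).max + 1
    val newSegment = Segment(nextId, Map(target -> newRows))
    val cleaned = segments.map(s => s.copy(data = s.data - target))
    cleaned :+ newSegment
  }

  // All rows currently visible for a partition, across all segments.
  def rowsOf(segments: Seq[Segment], spec: PartitionSpec): Seq[String] =
    segments.flatMap(_.data.getOrElse(spec, Seq.empty))

  def main(args: Array[String]): Unit = {
    val us = Map("country" -> "US")
    val in = Map("country" -> "IN")
    val before = Seq(
      Segment(0, Map(us -> Seq("alice"), in -> Seq("ravi"))),
      Segment(1, Map(us -> Seq("bob"))))

    val after = overwritePartition(before, us, Seq("carol"))
    println(rowsOf(after, us)) // only the newly loaded US rows remain
    println(rowsOf(after, in)) // the IN partition is unchanged
  }
}
```

In this model the old US rows disappear from segments 0 and 1 while the IN rows in segment 0 survive, which mirrors the "remaining partitions stay intact" behaviour the description asks for.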
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [X] Any interfaces changed? NO
     
     - [X] Any backward compatibility impacted? NO
     
     - [X] Document update required? YES
   
     - [X] Testing done
           Tests added
     - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata partition-overwrite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1677.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1677
   
----
commit 8c0673f4a8b526f665087366bf3ac4be9a19e9c0
Author: ravipesala <[hidden email]>
Date:   2017-12-18T17:07:59Z

    Added support to read partitions

commit 33f61d756ec49e19ff01e646ccc70af5824a3e8e
Author: ravipesala <[hidden email]>
Date:   2017-12-16T17:08:00Z

    Added drop partition feature

commit 9ef28c86b457c0b9ea066912f80745b11733b547
Author: ravipesala <[hidden email]>
Date:   2017-12-19T07:49:15Z

    Support insert overwrite partition

----


---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/900/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2127/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2410/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/903/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2411/



---

[GitHub] carbondata pull request #1677: [CARBONDATA-1860][PARTITION] Support insertov...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1677#discussion_r157829373
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -597,14 +602,18 @@ case class CarbonLoadDataCommand(
         }
         val partitionSchema =
           StructType(table.getPartitionInfo(table.getTableName).getColumnSchemaList.asScala.map(field =>
    -      metastoreSchema.fields.find(_.name.equalsIgnoreCase(field.getColumnName))).map(_.get))
    -
    +        metastoreSchema.fields.find(_.name.equalsIgnoreCase(field.getColumnName))).map(_.get))
    +    val overWriteLocal = if (overWrite && partition.nonEmpty) {
    --- End diff --
   
    In the case of a dynamic partition overwrite, existing partitions that are not present in the current load won't be deleted, so please handle them.
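
The comment above concerns partitions that already exist in the table but receive no rows from the current load. A minimal sketch of how that set could be identified follows. It is hypothetical: `loadedPartitions` and `untouchedPartitions` are illustrative names, not part of CarbonData or Spark, and the sketch only identifies the partitions; what should happen to them is left to the PR.

```scala
// Toy illustration only: none of these names come from CarbonData or Spark.
object DynamicOverwriteSketch {
  // A partition spec such as Map("country" -> "US").
  type PartitionSpec = Map[String, String]

  // Derive the partitions a load writes to from its rows,
  // given the names of the partition columns.
  def loadedPartitions(
      rows: Seq[Map[String, String]],
      partitionColumns: Seq[String]): Set[PartitionSpec] =
    rows.map(row => partitionColumns.map(c => c -> row(c)).toMap).toSet

  // Partitions that exist in the table but get no rows from the current load.
  def untouchedPartitions(
      existing: Set[PartitionSpec],
      loaded: Set[PartitionSpec]): Set[PartitionSpec] =
    existing.diff(loaded)

  def main(args: Array[String]): Unit = {
    val existing: Set[PartitionSpec] =
      Set(Map("country" -> "US"), Map("country" -> "IN"))
    val rows = Seq(Map("name" -> "carol", "country" -> "US"))

    val loaded = loadedPartitions(rows, Seq("country"))
    println(untouchedPartitions(existing, loaded))
  }
}
```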


---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/921/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2435/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2147/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/934/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2163/



---

[GitHub] carbondata pull request #1677: [CARBONDATA-1860][PARTITION] Support insertov...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1677#discussion_r158015203
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -567,14 +568,18 @@ case class CarbonLoadDataCommand(
               carbonLoadModel,
               sparkSession)
         }
    -    Dataset.ofRows(
    -      sparkSession,
    +    val convertedPlan =
           CarbonReflectionUtils.getInsertIntoCommand(
             convertRelation,
             partition,
             query,
    -        isOverwriteTable,
    -        false))
    +        false,
    +        false)
    +    if (isOverwriteTable && partition.nonEmpty && table.isHivePartitionTable) {
    --- End diff --
   
    The isHivePartitionTable check is not required here, as it has already been done.


---

[GitHub] carbondata pull request #1677: [CARBONDATA-1860][PARTITION] Support insertov...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1677#discussion_r158099643
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -567,14 +568,18 @@ case class CarbonLoadDataCommand(
               carbonLoadModel,
               sparkSession)
         }
    -    Dataset.ofRows(
    -      sparkSession,
    +    val convertedPlan =
           CarbonReflectionUtils.getInsertIntoCommand(
             convertRelation,
             partition,
             query,
    -        isOverwriteTable,
    -        false))
    +        false,
    +        false)
    +    if (isOverwriteTable && partition.nonEmpty && table.isHivePartitionTable) {
    --- End diff --
   
    ok


---

[GitHub] carbondata pull request #1677: [CARBONDATA-1860][PARTITION] Support insertov...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala closed the pull request at:

    https://github.com/apache/carbondata/pull/1677


---

[GitHub] carbondata pull request #1677: [CARBONDATA-1860][PARTITION] Support insertov...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user ravipesala reopened a pull request:

    https://github.com/apache/carbondata/pull/1677

    [CARBONDATA-1860][PARTITION] Support insertoverwrite for a specific partition.

    This PR depends on https://github.com/apache/carbondata/pull/1672 and https://github.com/apache/carbondata/pull/1674
    Users should be able to overwrite the data of a specific partition. For example:
    INSERT OVERWRITE TABLE partitioned_user
          PARTITION (country = 'US')
          SELECT * FROM another_user au
          WHERE au.country = 'US';
    In the above example, only the data of the partition (country = 'US') is overwritten; the data of all other partitions stays intact.
    To overwrite a specific partition, Carbon should first load the data into a new segment and then drop that partition from all other segments using the partition.map file.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [X] Any interfaces changed? NO
     
     - [X] Any backward compatibility impacted? NO
     
     - [X] Document update required? YES
   
     - [X] Testing done
           Tests added
     - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata partition-overwrite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1677.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1677
   
----
commit 32e23c7e0d1dfb0435ae70b6d1311e68cec4c615
Author: ravipesala <ravi.pesala@...>
Date:   2017-12-19T07:49:15Z

    Support insert overwrite partition

commit 9f0b7d8b1d28cd452633762057d8c7204765e816
Author: ravipesala <ravi.pesala@...>
Date:   2017-12-20T18:07:30Z

    handle comments

----


---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2191/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/968/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2463/



---

[GitHub] carbondata issue #1677: [CARBONDATA-1860][PARTITION] Support insertoverwrite...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on the issue:

    https://github.com/apache/carbondata/pull/1677
 
    LGTM


---