GitHub user ravipesala opened a pull request:
https://github.com/apache/carbondata/pull/1677

[CARBONDATA-1860][PARTITION] Support insert overwrite for a specific partition

Users should be able to overwrite the data of a specific partition. For example:

    INSERT OVERWRITE TABLE partitioned_user PARTITION (country = 'US')
    SELECT * FROM another_user au WHERE au.country = 'US';

In the above example, only the data of the partition (country = 'US') is overwritten; the data of the remaining partitions stays intact.

While overwriting a specific partition, Carbon should first load the data into a new segment and then drop that partition from all remaining segments using the partition.map file.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [X] Any interfaces changed? NO
 - [X] Any backward compatibility impacted? NO
 - [X] Document update required? YES
 - [X] Testing done: Tests added
 - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata partition-overwrite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1677.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1677

----
commit 8c0673f4a8b526f665087366bf3ac4be9a19e9c0
Author: ravipesala <[hidden email]>
Date: 2017-12-18T17:07:59Z

    Added support to read partitions

commit 33f61d756ec49e19ff01e646ccc70af5824a3e8e
Author: ravipesala <[hidden email]>
Date: 2017-12-16T17:08:00Z

    Added drop partition feature

commit 9ef28c86b457c0b9ea066912f80745b11733b547
Author: ravipesala <[hidden email]>
Date: 2017-12-19T07:49:15Z

    Support insert overwrite partition
----
---
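The load-then-drop behaviour described in the pull request can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `Segment`, `PartitionSpec`, `overwritePartition` and `read` are hypothetical names made up for illustration, not CarbonData classes, and the `partitions` map merely stands in for the role the description gives to the partition.map file.

```scala
// Hypothetical in-memory model; these are NOT CarbonData's actual classes.
object PartitionOverwriteSketch {
  // A partition is identified by its column -> value pairs, e.g. country -> US.
  type PartitionSpec = Map[String, String]

  // A segment holds rows grouped by partition. The `partitions` map is an
  // assumed stand-in for the per-segment partition.map file.
  final case class Segment(id: Int, partitions: Map[PartitionSpec, Seq[String]])

  // Overwrite one partition: put the new rows into a fresh segment, then
  // drop that partition from every pre-existing segment.
  def overwritePartition(
      segments: Seq[Segment],
      target: PartitionSpec,
      newRows: Seq[String]): Seq[Segment] = {
    val nextId = if (segments.isEmpty) 0 else segments.map(_.id).max + 1
    val newSegment = Segment(nextId, Map(target -> newRows))
    // Other partitions in the old segments are left untouched.
    val pruned = segments.map(s => s.copy(partitions = s.partitions - target))
    pruned :+ newSegment
  }

  // All rows currently visible for one partition, across all segments.
  def read(segments: Seq[Segment], spec: PartitionSpec): Seq[String] =
    segments.flatMap(_.partitions.getOrElse(spec, Seq.empty))
}
```

In this model, overwriting (country = 'US') leaves the rows of every other partition readable from the older segments, which matches the "remaining partitions stay intact" statement in the description.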
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.2.0. Please check CI: http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/900/
---
In reply to this post by qiuchenjian-2
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Failed with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2127/
---
GitHub user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1677
SDV Build Success. Please check CI: http://144.76.159.231:8080/job/ApacheSDVTests/2410/
---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.2.0. Please check CI: http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/903/
---
GitHub user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1677
SDV Build Fail. Please check CI: http://144.76.159.231:8080/job/ApacheSDVTests/2411/
---
GitHub user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1677#discussion_r157829373

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -597,14 +602,18 @@ case class CarbonLoadDataCommand(
       }
       val partitionSchema = StructType(table.getPartitionInfo(table.getTableName).getColumnSchemaList.asScala.map(field =>
    -  metastoreSchema.fields.find(_.name.equalsIgnoreCase(field.getColumnName))).map(_.get))
    -
    +  metastoreSchema.fields.find(_.name.equalsIgnoreCase(field.getColumnName))).map(_.get))
    +  val overWriteLocal = if (overWrite && partition.nonEmpty) {
--- End diff --

In the case of a dynamic partition overwrite, existing partitions that are not present in the current load won't be deleted, so please handle them.
---
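The review comment above does not spell out the intended fix. One possible reading, sketched here purely as an assumption, is that the set of partitions to drop should be derived from the statically specified part of the PARTITION clause rather than only from the partitions the current load happened to write. The names `partitionsToDrop` and `untouchedByLoad` are hypothetical and do not come from the CarbonData code base.

```scala
// Hypothetical helper; an assumed reading of the review comment, not the PR's code.
object DynamicOverwriteSketch {
  type PartitionSpec = Map[String, String]

  // Existing partitions that agree with every statically specified column.
  // Columns absent from `staticSpec` are treated as dynamic and match any value.
  def partitionsToDrop(
      existing: Set[PartitionSpec],
      staticSpec: PartitionSpec): Set[PartitionSpec] =
    existing.filter { p =>
      staticSpec.forall { case (col, value) => p.get(col).contains(value) }
    }

  // Partitions selected for dropping that the current load did not write.
  // These are the ones the comment says would otherwise be left behind.
  def untouchedByLoad(
      toDrop: Set[PartitionSpec],
      loaded: Set[PartitionSpec]): Set[PartitionSpec] =
    toDrop -- loaded
}
```

With a fully dynamic spec (an empty `staticSpec`) every existing partition matches, so under this reading the statement behaves like a whole-table overwrite. Whether that is the behaviour the reviewer wanted is not stated in the thread.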
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Failed with Spark 2.2.0. Please check CI: http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/921/
---
GitHub user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1677
SDV Build Fail. Please check CI: http://144.76.159.231:8080/job/ApacheSDVTests/2435/
---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2147/
---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.2.0. Please check CI: http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/934/
---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2163/
---
GitHub user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1677#discussion_r158015203

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -567,14 +568,18 @@ case class CarbonLoadDataCommand(
       carbonLoadModel, sparkSession)
       }
    -  Dataset.ofRows(
    -  sparkSession,
    +  val convertedPlan =
       CarbonReflectionUtils.getInsertIntoCommand(
       convertRelation,
       partition,
       query,
    -  isOverwriteTable,
    -  false))
    +  false,
    +  false)
    +  if (isOverwriteTable && partition.nonEmpty && table.isHivePartitionTable) {
--- End diff --

The isHivePartitionTable check is not required here, as it has already been done.
---
GitHub user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1677#discussion_r158099643

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -567,14 +568,18 @@ case class CarbonLoadDataCommand(
       carbonLoadModel, sparkSession)
       }
    -  Dataset.ofRows(
    -  sparkSession,
    +  val convertedPlan =
       CarbonReflectionUtils.getInsertIntoCommand(
       convertRelation,
       partition,
       query,
    -  isOverwriteTable,
    -  false))
    +  false,
    +  false)
    +  if (isOverwriteTable && partition.nonEmpty && table.isHivePartitionTable) {
--- End diff --

ok
---
GitHub user ravipesala closed the pull request at:
https://github.com/apache/carbondata/pull/1677
---
GitHub user ravipesala reopened a pull request:
https://github.com/apache/carbondata/pull/1677

[CARBONDATA-1860][PARTITION] Support insert overwrite for a specific partition

This PR depends on https://github.com/apache/carbondata/pull/1672 and https://github.com/apache/carbondata/pull/1674

Users should be able to overwrite the data of a specific partition. For example:

    INSERT OVERWRITE TABLE partitioned_user PARTITION (country = 'US')
    SELECT * FROM another_user au WHERE au.country = 'US';

In the above example, only the data of the partition (country = 'US') is overwritten; the data of the remaining partitions stays intact.

While overwriting a specific partition, Carbon should first load the data into a new segment and then drop that partition from all remaining segments using the partition.map file.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [X] Any interfaces changed? NO
 - [X] Any backward compatibility impacted? NO
 - [X] Document update required? YES
 - [X] Testing done: Tests added
 - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata partition-overwrite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1677.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1677

----
commit 32e23c7e0d1dfb0435ae70b6d1311e68cec4c615
Author: ravipesala <ravi.pesala@...>
Date: 2017-12-19T07:49:15Z

    Support insert overwrite partition

commit 9f0b7d8b1d28cd452633762057d8c7204765e816
Author: ravipesala <ravi.pesala@...>
Date: 2017-12-20T18:07:30Z

    handle comments
----
---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2191/
---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1677
Build Success with Spark 2.2.0. Please check CI: http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/968/
---
GitHub user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1677
SDV Build Fail. Please check CI: http://144.76.159.231:8080/job/ApacheSDVTests/2463/
---
GitHub user gvramana commented on the issue:
https://github.com/apache/carbondata/pull/1677
LGTM
---