GitHub user sounakr opened a pull request:
https://github.com/apache/carbondata/pull/2177 [WIP][Unmanaged Table] Insert into Unmanaged Table Insert into Unmanaged Table. - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sounakr/incubator-carbondata InsertInto Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2177.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2177 ---- commit b22e0052d7f204309b02c11b4b0f7bd685c1e56c Author: sounakr <sounakr@...> Date: 2018-04-17T08:08:51Z Insert into Unmanaged Table ---- --- |
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2177 Please change the term "unmanaged table" to "external table" --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r181993630 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestUnmanagedCarbonTable.scala --- @@ -353,13 +389,17 @@ class TestUnmanagedCarbonTable extends QueryTest with BeforeAndAfterAll { Assert.assertNotEquals(1, dataFiles.length) sql("DROP TABLE IF EXISTS sdkOutputTable") + sql("DROP TABLE IF EXISTS t1") sql( s"""CREATE EXTERNAL TABLE sdkOutputTable STORED BY 'carbondata' LOCATION |'$writerPath' """.stripMargin) checkAnswer(sql("select count(*) from sdkOutputTable"), Seq(Row(1000000))) + + --- End diff -- remove empty line --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3877/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5099/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3889/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5111/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3937/ --- |
In reply to this post by qiuchenjian-2
Github user sounakr commented on the issue:
https://github.com/apache/carbondata/pull/2177 Retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4027/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2177 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5222/ --- |
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r182968973 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala --- @@ -381,8 +381,10 @@ class NewDataFrameLoaderRDD[K, V]( carbonLoadModel.getTableName + CarbonCommonConstants.UNDERSCORE + theSplit.index try { - loadMetadataDetails.setPartitionCount(CarbonTablePath.DEPRECATED_PATITION_ID) - loadMetadataDetails.setSegmentStatus(SegmentStatus.SUCCESS) + if (!carbonLoadModel.isCarbonNonTransactionalTable) { --- End diff -- this check should not be present, only writing to tablestatus needs to be avoided. All the remaining flow should be same. --- |
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r182969475 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala --- @@ -425,9 +427,11 @@ class NewDataFrameLoaderRDD[K, V]( TableProcessingOperations.deleteLocalDataLoadFolderLocation(model, false, false) // in case of failure the same operation will be re-tried several times. // So print the data load statistics only in case of non failure case - if (SegmentStatus.LOAD_FAILURE != loadMetadataDetails.getSegmentStatus) { - CarbonTimeStatisticsFactory.getLoadStatisticsInstance - .printStatisticsInfo(CarbonTablePath.DEPRECATED_PATITION_ID) + if (!carbonLoadModel.isCarbonNonTransactionalTable) { --- End diff -- this check should not be present, logic should be same for transactional and non-transactional tables; --- |
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183043670 --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java --- @@ -169,8 +169,28 @@ public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry, * @throws IOException */ public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry, - CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid) + final CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid) throws IOException { + if (loadModel.isCarbonNonTransactionalTable()) { --- End diff -- Function name says updation of metadata, so should not delete data in that --- |
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183074326 --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java --- @@ -169,8 +169,28 @@ public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry, * @throws IOException */ public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry, - CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid) + final CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid) throws IOException { + if (loadModel.isCarbonNonTransactionalTable()) { --- End diff -- Done --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183085076 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java --- @@ -144,12 +144,12 @@ private boolean hasDataMapSchema; /** - * The boolean field which points if the data written for UnManaged Table - * or Managed Table. The difference between managed and unManaged table is - * unManaged Table will not contain any Metadata folder and subsequently + * The boolean field which points if the data written for Non Transactional Table + * or Transactional Table. The difference between Transactional and non Transactional table is + * non Transactional Table will not contain any Metadata folder and subsequently * no TableStatus or Schema files. */ - private boolean isUnManagedTable; + private boolean isNonTransactionalTable; --- End diff -- change to `isTransactional` --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183085758 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java --- @@ -144,12 +144,12 @@ private boolean hasDataMapSchema; /** - * The boolean field which points if the data written for UnManaged Table - * or Managed Table. The difference between managed and unManaged table is - * unManaged Table will not contain any Metadata folder and subsequently + * The boolean field which points if the data written for Non Transactional Table + * or Transactional Table. The difference between Transactional and non Transactional table is + * non Transactional Table will not contain any Metadata folder and subsequently --- End diff -- Mention that `transactional table means carbon will provide transactional support when user doing data management like data loading, whether it is success or failure, data will be in consistent state` --- |
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183204286 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala --- @@ -381,8 +381,10 @@ class NewDataFrameLoaderRDD[K, V]( carbonLoadModel.getTableName + CarbonCommonConstants.UNDERSCORE + theSplit.index try { - loadMetadataDetails.setPartitionCount(CarbonTablePath.DEPRECATED_PATITION_ID) - loadMetadataDetails.setSegmentStatus(SegmentStatus.SUCCESS) + if (!carbonLoadModel.isCarbonNonTransactionalTable) { --- End diff -- Done --- |
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183204292 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala --- @@ -425,9 +427,11 @@ class NewDataFrameLoaderRDD[K, V]( TableProcessingOperations.deleteLocalDataLoadFolderLocation(model, false, false) // in case of failure the same operation will be re-tried several times. // So print the data load statistics only in case of non failure case - if (SegmentStatus.LOAD_FAILURE != loadMetadataDetails.getSegmentStatus) { - CarbonTimeStatisticsFactory.getLoadStatisticsInstance - .printStatisticsInfo(CarbonTablePath.DEPRECATED_PATITION_ID) + if (!carbonLoadModel.isCarbonNonTransactionalTable) { --- End diff -- Done --- |
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2177#discussion_r183204294 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java --- @@ -144,12 +144,12 @@ private boolean hasDataMapSchema; /** - * The boolean field which points if the data written for UnManaged Table - * or Managed Table. The difference between managed and unManaged table is - * unManaged Table will not contain any Metadata folder and subsequently + * The boolean field which points if the data written for Non Transactional Table + * or Transactional Table. The difference between Transactional and non Transactional table is + * non Transactional Table will not contain any Metadata folder and subsequently * no TableStatus or Schema files. */ - private boolean isUnManagedTable; + private boolean isNonTransactionalTable; --- End diff -- Done --- |
Free forum by Nabble | Edit this page |