Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] carbondata pull request #2177: [WIP][Unmanaged Table] Insert into Unmanaged ...

Classic

List

36 messages Options

Options

12

[GitHub] carbondata pull request #2177: [WIP][Unmanaged Table] Insert into Unmanaged ...

GitHub user sounakr opened a pull request:

https://github.com/apache/carbondata/pull/2177

[WIP][Unmanaged Table] Insert into Unmanaged Table

Insert into Unmanaged Table.

- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?

- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.

- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sounakr/incubator-carbondata InsertInto

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2177

----
commit b22e0052d7f204309b02c11b4b0f7bd685c1e56c
Author: sounakr <sounakr@...>
Date: 2018-04-17T08:08:51Z

Insert into Unmanaged Table

----

---

[GitHub] carbondata issue #2177: [WIP][Unmanaged Table] Insert into Unmanaged Table

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2177

Please change the term "unmanaged table" to "external table"

---

[GitHub] carbondata pull request #2177: [WIP][Unmanaged Table] Insert into Unmanaged ...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r181993630

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestUnmanagedCarbonTable.scala ---
@@ -353,13 +389,17 @@ class TestUnmanagedCarbonTable extends QueryTest with BeforeAndAfterAll {
Assert.assertNotEquals(1, dataFiles.length)

sql("DROP TABLE IF EXISTS sdkOutputTable")
+ sql("DROP TABLE IF EXISTS t1")

sql(
s"""CREATE EXTERNAL TABLE sdkOutputTable STORED BY 'carbondata' LOCATION
|'$writerPath' """.stripMargin)

checkAnswer(sql("select count(*) from sdkOutputTable"), Seq(Row(1000000)))

+
+
--- End diff --

remove empty line

---

[GitHub] carbondata issue #2177: [WIP][Unmanaged Table] Insert into Unmanaged Table

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3877/

---

[GitHub] carbondata issue #2177: [WIP][Unmanaged Table] Insert into Unmanaged Table

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5099/

---

[GitHub] carbondata issue #2177: [WIP][Unmanaged Table] Insert into Unmanaged Table

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3889/

---

[GitHub] carbondata issue #2177: [WIP][Unmanaged Table] Insert into Unmanaged Table

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5111/

---

[GitHub] carbondata issue #2177: [CARBONDATA-2360][Non Transactional Table]Insert int...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3937/

---

[GitHub] carbondata issue #2177: [CARBONDATA-2360][Non Transactional Table]Insert int...

In reply to this post by qiuchenjian-2

Github user sounakr commented on the issue:

https://github.com/apache/carbondata/pull/2177

Retest this please

---

[GitHub] carbondata issue #2177: [CARBONDATA-2360][Non Transactional Table]Insert int...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4027/

---

[GitHub] carbondata issue #2177: [CARBONDATA-2360][Non Transactional Table]Insert int...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2177

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5222/

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user gvramana commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r182968973

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala ---
@@ -381,8 +381,10 @@ class NewDataFrameLoaderRDD[K, V](
carbonLoadModel.getTableName + CarbonCommonConstants.UNDERSCORE + theSplit.index
try {

- loadMetadataDetails.setPartitionCount(CarbonTablePath.DEPRECATED_PATITION_ID)
- loadMetadataDetails.setSegmentStatus(SegmentStatus.SUCCESS)
+ if (!carbonLoadModel.isCarbonNonTransactionalTable) {
--- End diff --

this check should not be present, only writing to tablestatus needs to be avoided. All the remaining flow should be same.

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user gvramana commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r182969475

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala ---
@@ -425,9 +427,11 @@ class NewDataFrameLoaderRDD[K, V](
TableProcessingOperations.deleteLocalDataLoadFolderLocation(model, false, false)
// in case of failure the same operation will be re-tried several times.
// So print the data load statistics only in case of non failure case
- if (SegmentStatus.LOAD_FAILURE != loadMetadataDetails.getSegmentStatus) {
- CarbonTimeStatisticsFactory.getLoadStatisticsInstance
- .printStatisticsInfo(CarbonTablePath.DEPRECATED_PATITION_ID)
+ if (!carbonLoadModel.isCarbonNonTransactionalTable) {
--- End diff --

this check should not be present, logic should be same for transactional and non-transactional tables;

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user gvramana commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183043670

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -169,8 +169,28 @@ public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry,
* @throws IOException
*/
public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry,
- CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid)
+ final CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid)
throws IOException {
+ if (loadModel.isCarbonNonTransactionalTable()) {
--- End diff --

Function name says updation of metadata, so should not delete data in that

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user sounakr commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183074326

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -169,8 +169,28 @@ public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry,
* @throws IOException
*/
public static boolean recordNewLoadMetadata(LoadMetadataDetails newMetaEntry,
- CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid)
+ final CarbonLoadModel loadModel, boolean loadStartEntry, boolean insertOverwrite, String uuid)
throws IOException {
+ if (loadModel.isCarbonNonTransactionalTable()) {
--- End diff --

Done

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183085076

--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
@@ -144,12 +144,12 @@
private boolean hasDataMapSchema;

/**
- * The boolean field which points if the data written for UnManaged Table
- * or Managed Table. The difference between managed and unManaged table is
- * unManaged Table will not contain any Metadata folder and subsequently
+ * The boolean field which points if the data written for Non Transactional Table
+ * or Transactional Table. The difference between Transactional and non Transactional table is
+ * non Transactional Table will not contain any Metadata folder and subsequently
* no TableStatus or Schema files.
*/
- private boolean isUnManagedTable;
+ private boolean isNonTransactionalTable;
--- End diff --

change to `isTransactional`

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183085758

--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
@@ -144,12 +144,12 @@
private boolean hasDataMapSchema;

/**
- * The boolean field which points if the data written for UnManaged Table
- * or Managed Table. The difference between managed and unManaged table is
- * unManaged Table will not contain any Metadata folder and subsequently
+ * The boolean field which points if the data written for Non Transactional Table
+ * or Transactional Table. The difference between Transactional and non Transactional table is
+ * non Transactional Table will not contain any Metadata folder and subsequently
--- End diff --

Mention that `transactional table means carbon will provide transactional support when user doing data management like data loading, whether it is success or failure, data will be in consistent state`

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user sounakr commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183204286

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala ---
@@ -381,8 +381,10 @@ class NewDataFrameLoaderRDD[K, V](
carbonLoadModel.getTableName + CarbonCommonConstants.UNDERSCORE + theSplit.index
try {

- loadMetadataDetails.setPartitionCount(CarbonTablePath.DEPRECATED_PATITION_ID)
- loadMetadataDetails.setSegmentStatus(SegmentStatus.SUCCESS)
+ if (!carbonLoadModel.isCarbonNonTransactionalTable) {
--- End diff --

Done

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user sounakr commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183204292

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala ---
@@ -425,9 +427,11 @@ class NewDataFrameLoaderRDD[K, V](
TableProcessingOperations.deleteLocalDataLoadFolderLocation(model, false, false)
// in case of failure the same operation will be re-tried several times.
// So print the data load statistics only in case of non failure case
- if (SegmentStatus.LOAD_FAILURE != loadMetadataDetails.getSegmentStatus) {
- CarbonTimeStatisticsFactory.getLoadStatisticsInstance
- .printStatisticsInfo(CarbonTablePath.DEPRECATED_PATITION_ID)
+ if (!carbonLoadModel.isCarbonNonTransactionalTable) {
--- End diff --

Done

---

[GitHub] carbondata pull request #2177: [CARBONDATA-2360][Non Transactional Table]Ins...

In reply to this post by qiuchenjian-2

Github user sounakr commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2177#discussion_r183204294

--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
@@ -144,12 +144,12 @@
private boolean hasDataMapSchema;

/**
- * The boolean field which points if the data written for UnManaged Table
- * or Managed Table. The difference between managed and unManaged table is
- * unManaged Table will not contain any Metadata folder and subsequently
+ * The boolean field which points if the data written for Non Transactional Table
+ * or Transactional Table. The difference between Transactional and non Transactional table is
+ * non Transactional Table will not contain any Metadata folder and subsequently
* no TableStatus or Schema files.
*/
- private boolean isUnManagedTable;
+ private boolean isNonTransactionalTable;
--- End diff --

Done

---

12