QiangCai opened a new pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899

### Why is this PR needed?
When reading the tablestatus file fails, the method TableProcessingOperations.deletePartialLoadDataIfExist deletes all related segments.

### What changes were proposed in this PR?
1. Check the result of reading the tablestatus file; if it is empty, there is nothing to process.
2. Refactor the code to avoid invoking the getAbsolutePath method too many times.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No
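To make the two proposed changes concrete, here is a minimal, hypothetical sketch of the guard and the path reuse described above. The class, helper, and path names (StaleDataCleanerSketch, readTableStatus, the Fact/Part0/Segment_ layout) are illustrative stand-ins, not the actual TableProcessingOperations code:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only -- names below do not match the real CarbonData classes.
public final class StaleDataCleanerSketch {

  /** Stand-in for one entry read from the tablestatus file. */
  static final class LoadDetail {
    final String segmentId;
    LoadDetail(String segmentId) {
      this.segmentId = segmentId;
    }
  }

  /** Stand-in for reading the tablestatus file; empty when the file is missing or unreadable. */
  static List<LoadDetail> readTableStatus(String tablePath) {
    return Collections.emptyList();
  }

  static void deletePartialLoadDataIfExist(String tablePath) {
    // Change 1: if reading the tablestatus file yields nothing, do not treat
    // every segment as partial and delete it -- just return.
    List<LoadDetail> details = readTableStatus(tablePath);
    if (details.isEmpty()) {
      return;
    }
    // Change 2: build each path once and reuse it, instead of calling a
    // getAbsolutePath-style method repeatedly inside the loop body.
    for (LoadDetail detail : details) {
      String segmentPath = tablePath + "/Fact/Part0/Segment_" + detail.segmentId;
      cleanIfPartial(segmentPath);
    }
  }

  private static void cleanIfPartial(String segmentPath) {
    // Placeholder for the actual partial-load cleanup of one segment directory.
  }
}
```

Returning early on an empty read keeps a failed or missing tablestatus file from being misinterpreted as "no valid segments exist", which is what previously caused all related segments to be deleted.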
CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680085795

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2122/
CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680086546

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3863/
QiangCai commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680701522

@kunal642 please check this modification.
CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680754700

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3872/
CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680755166

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2131/
Zhangshunyu commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680768668

LGTM
kunal642 commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477173788

File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala

@@ -310,7 +310,8 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     val tableStatusFile = CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath)
     FileFactory.getCarbonFile(tableStatusFile).delete()
     sql("insert into stale values('k')")
-    checkAnswer(sql("select * from stale"), Row("k"))
+    // if table lose tablestatus file, the system should keep all data.
+    checkAnswer(sql("select * from stale"), Seq(Row("k"), Row("k")))

Review comment:
I think this behaviour would be harmful when the segment is marked for delete. If the segment is not removed then the results would be incorrect.
QiangCai commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477183265

Review comment (on the same change in TestLoadDataGeneral.scala):
Yes.
1. For now, after the tablestatus file is removed, all segments are considered successful by default.
2. In the future, we could add a flag file to a segment when we mark that segment for delete; the flag file would record that the segment is mark_for_delete.
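As a purely illustrative sketch of the flag-file idea in the previous comment (the file name .mark_for_delete and the class and method names below are assumptions, not an existing CarbonData API), the marker could be written into the segment directory when the segment is marked for delete and consulted whenever the tablestatus file is missing:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the flag-file idea; not an existing CarbonData API.
public final class SegmentDeleteFlagSketch {

  private static final String FLAG_FILE_NAME = ".mark_for_delete";

  /** Drop a marker file into the segment directory when it is marked for delete. */
  static void markForDelete(String segmentDir) throws IOException {
    Path flag = Paths.get(segmentDir, FLAG_FILE_NAME);
    Files.write(flag, "mark_for_delete".getBytes(StandardCharsets.UTF_8));
  }

  /**
   * With the tablestatus file lost, segments without the marker are kept
   * (treated as successful), while marked segments can still be skipped.
   */
  static boolean isMarkedForDelete(String segmentDir) {
    return Files.exists(Paths.get(segmentDir, FLAG_FILE_NAME));
  }
}
```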
QiangCai commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477191769

Review comment (on the same change):
For this test case the result is two rows, so the behavior is correct.
kunal642 commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477201000

Review comment (on the same change):
ok
kunal642 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680798546

LGTM
asfgit closed pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899