[GitHub] [carbondata] QiangCai opened a new pull request #3899: [HOTFIX] avoid to delete all segments when reading tablestatus failed

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai opened a new pull request #3899: [HOTFIX] avoid to delete all segments when reading tablestatus failed

GitBox

QiangCai opened a new pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899


    ### Why is this PR needed?
    when reading tablestatus file failed,  method TableProcessingOperations. deletePartialLoadDataIfExist will delete all related segments.
   
    ### What changes were proposed in this PR?
   1. check the result of reading tablestatus file, if it is empty, no need to process.
   2. re-factory code to avoid invoking getAbsolutePath method too many times.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox

CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680085795


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2122/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680086546


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3863/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on pull request #3899: [WIP] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

QiangCai commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680701522


   @kunal642 please check this modification.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680754700


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3872/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680755166


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2131/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Zhangshunyu commented on pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

Zhangshunyu commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680768668


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on a change in pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477173788



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
##########
@@ -310,7 +310,8 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     val tableStatusFile = CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath)
     FileFactory.getCarbonFile(tableStatusFile).delete()
     sql("insert into stale values('k')")
-    checkAnswer(sql("select * from stale"), Row("k"))
+    // if table lose tablestatus file, the system should keep all data.
+    checkAnswer(sql("select * from stale"), Seq(Row("k"), Row("k")))

Review comment:
       I think this behaviour would be harrmful when the segment is marked for delete..If the segment is not removed then the results would be incorrect.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477183265



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
##########
@@ -310,7 +310,8 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     val tableStatusFile = CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath)
     FileFactory.getCarbonFile(tableStatusFile).delete()
     sql("insert into stale values('k')")
-    checkAnswer(sql("select * from stale"), Row("k"))
+    // if table lose tablestatus file, the system should keep all data.
+    checkAnswer(sql("select * from stale"), Seq(Row("k"), Row("k")))

Review comment:
       yes
   1. now, after removing the tablestatus file,  we can consider all segments are successful by default.
   2. in the future, we can add a flag file into the segment when we mark the segment to be deleted.
      the flag file contains a message the segment is mark_for_delete.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477191769



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
##########
@@ -310,7 +310,8 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     val tableStatusFile = CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath)
     FileFactory.getCarbonFile(tableStatusFile).delete()
     sql("insert into stale values('k')")
-    checkAnswer(sql("select * from stale"), Row("k"))
+    // if table lose tablestatus file, the system should keep all data.
+    checkAnswer(sql("select * from stale"), Seq(Row("k"), Row("k")))

Review comment:
       for this test case, the result is two rows,  the behavior is right.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on a change in pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#discussion_r477201000



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
##########
@@ -310,7 +310,8 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     val tableStatusFile = CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath)
     FileFactory.getCarbonFile(tableStatusFile).delete()
     sql("insert into stale values('k')")
-    checkAnswer(sql("select * from stale"), Row("k"))
+    // if table lose tablestatus file, the system should keep all data.
+    checkAnswer(sql("select * from stale"), Seq(Row("k"), Row("k")))

Review comment:
       ok




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899#issuecomment-680798546


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3899: [HOTFIX] Aviod cleaning segments after reading tablestatus failed

GitBox
In reply to this post by GitBox

asfgit closed pull request #3899:
URL: https://github.com/apache/carbondata/pull/3899


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]