ajantha-bhat opened a new pull request #3994: URL: https://github.com/apache/carbondata/pull/3994 ### Why is this PR needed? For compaction, we don't register in-progress segment. so, when unable to get a table status lock. compaction can fail. That time compaction partial segment needs to be cleaned. If the partial segment is failed to clean up due to unable to get lock or IO issues. When the user retries the compaction. carbon uses the same segment id. so while writing the segment file for new compaction. list only the files mapping to the current compaction, not all the files which contain stale files. ### What changes were proposed in this PR? While writing the segment file, consider index files belongs to the current load only in the segment folder. ### Does this PR introduce any user interface change? - No ### Is any new testcase added? - No [As it happens in concurrent scenario randomly, manually verified] ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
ajantha-bhat commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714445721 @QiangCai : please check this. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714556030 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2876/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714569205 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4632/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714616941 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714671495 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4640/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714711458 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2884/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai commented on a change in pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#discussion_r510548190 ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ########## @@ -398,27 +398,29 @@ public static void mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String * @throws IOException */ public static String writeSegmentFile(CarbonTable carbonTable, String segmentId, String UUID, Review comment: for the update, it will have more than one timstamp, right ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai commented on a change in pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#discussion_r510548190 ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ########## @@ -398,27 +398,29 @@ public static void mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String * @throws IOException */ public static String writeSegmentFile(CarbonTable carbonTable, String segmentId, String UUID, Review comment: for the update, it will have more than one timestamp, right ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ########## @@ -398,27 +398,29 @@ public static void mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String * @throws IOException */ public static String writeSegmentFile(CarbonTable carbonTable, String segmentId, String UUID, Review comment: for the update, it will have more than one timestamp, right? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#discussion_r510596257 ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ########## @@ -398,27 +398,29 @@ public static void mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String * @throws IOException */ public static String writeSegmentFile(CarbonTable carbonTable, String segmentId, String UUID, Review comment: Agree, Let me check and modify ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#discussion_r510671271 ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ########## @@ -398,27 +398,29 @@ public static void mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String * @throws IOException */ public static String writeSegmentFile(CarbonTable carbonTable, String segmentId, String UUID, Review comment: I have pushed now. waiting for the build ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-715171177 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2903/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-715172063 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4659/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#discussion_r510987950 ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ########## @@ -398,27 +398,29 @@ public static void mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String * @throws IOException */ public static String writeSegmentFile(CarbonTable carbonTable, String segmentId, String UUID, Review comment: @QiangCai : I have thought about, only non update scenario I can handle this issue. For update there is no easy way currently to find out which is stale and which is not. One way for update is to read old segment file and add files that timestamp greater than old segment file content + old segment file content. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
marchpure commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-716933491 Hi, Ajantha. I have fixed this in #3999 , please have a check. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-716998184 > Hi, Ajantha. I have fixed this in #3999 , please have a check. @marchpure: I think your PR is not handling all the scenarios, so update scenario and test case will fail (even mine also) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
marchpure commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-717010368 > > Hi, Ajantha. I have fixed this in #3999 , please have a check. > > @marchpure: I think your PR is not handling all the scenarios, so update scenario and test case will fail (even mine also) yes. update test case all passes i am handing other 10 test failures ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
ajantha-bhat commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-717014461 @marchpure : If you dont assign UUID for all update case or compaction after update case, all testcases will pass. But the original data mismatch will still happen for segment that has stale files and went for update and compaction (that scenario you can test) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
marchpure commented on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-717015851 > @marchpure : If you dont assign UUID for all update case or compaction after update case, all testcases will pass. But the original data mismatch will still happen for segment that has stale files and went for update and compaction (that scenario you can test) yes. I have test it. update will generate new segment like merge into. which can void recreate segment file. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
marchpure edited a comment on pull request #3994: URL: https://github.com/apache/carbondata/pull/3994#issuecomment-717015851 > @marchpure : If you dont assign UUID for all update case or compaction after update case, all testcases will pass. But the original data mismatch will still happen for segment that has stale files and went for update and compaction (that scenario you can test) yes. I have test it. in pr3999, update will generate new segment like merge into. which can void recreate segment file. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |