jackylk opened a new pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561 ### Why is this PR needed? There are 2 issues: 1. INSERT STAGE will skip bytes to read the last long in the carbondata file as the offset to the footer, but sometimes DFS is not skipping the right byte size 2. When error occurs, segment status is not in SUCCESS state, but recovery handling is not doing correctly in INSERT STAGE command ### What changes were proposed in this PR? 1. In CarbonFooterReaderV3.readFooterVersion3, added a while loop to skip required bytes 2. In CarbonInsertFromStageCommand, when recovery is required, delete the segment entry if the entry is not in SUCESS state. ### Does this PR introduce any user interface change? No ### Is any new testcase added? No ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570779633 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1459/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570792363 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1461/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570795547 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1462/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570878021 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1470/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570883085 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570890128 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1471/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570899691 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570906477 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1479/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
niuge01 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571403849 LGTM ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
QiangCai commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r363613653 ########## File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonFooterReaderV3.java ########## @@ -59,17 +55,20 @@ public FileFooter3 readFooterVersion3() throws IOException { thriftReader.open(); // If footer offset is 0, means caller does not set it, - // so we read it from the end of the file + // so we read it from the index file if (footerOffset == 0) { Review comment: The function of this method should be single, it should only support reading footer. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
QiangCai commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r363614028 ########## File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonFooterReaderV3.java ########## @@ -59,17 +55,20 @@ public FileFooter3 readFooterVersion3() throws IOException { thriftReader.open(); // If footer offset is 0, means caller does not set it, - // so we read it from the end of the file + // so we read it from the index file if (footerOffset == 0) { Review comment: if it needs to read the index, better to implement in another place where it decides to read index or footer. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571916593 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1526/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571917785 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1527/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571957587 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1530/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
qiuchenjian commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364137100 ########## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ########## @@ -714,4 +714,16 @@ public static String generateBadRecordsPath(String badLogStoreLocation, String s CarbonCommonConstants.FILE_SEPARATOR + taskNo; } } + + /** + * Return the parent path of the input file + */ + public static String getParentPath(String dataFilePath) { + int endIndex = dataFilePath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR); Review comment: CarbonCommonConstants.FILE_SEPARATOR is the last character of dataFilePath, such as '/user/warehouse/carbon.store/www/' ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364329697 ########## File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonIndexFileReader.java ########## @@ -112,4 +112,34 @@ public void openThriftReader(byte[] fileData) throws IOException { public boolean hasNext() throws IOException { return thriftReader.hasNext(); } + + /** + * Return the footer offset of the given dataFileName in the index file + * + * @param indexFilePath path of the index file + * @param dataFileName file name of the data file to match in the index file + * @param conf hadoop configuration + * @return footer offset of the data file + * @throws IOException + */ + public static long readFooterOffsetFromIndexFile(String indexFilePath, String dataFileName, Review comment: This method is unused..Please remove if not needed ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364532329 ########## File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonIndexFileReader.java ########## @@ -112,4 +112,34 @@ public void openThriftReader(byte[] fileData) throws IOException { public boolean hasNext() throws IOException { return thriftReader.hasNext(); } + + /** + * Return the footer offset of the given dataFileName in the index file + * + * @param indexFilePath path of the index file + * @param dataFileName file name of the data file to match in the index file + * @param conf hadoop configuration + * @return footer offset of the data file + * @throws IOException + */ + public static long readFooterOffsetFromIndexFile(String indexFilePath, String dataFileName, Review comment: fixed ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364532478 ########## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ########## @@ -714,4 +714,16 @@ public static String generateBadRecordsPath(String badLogStoreLocation, String s CarbonCommonConstants.FILE_SEPARATOR + taskNo; } } + + /** + * Return the parent path of the input file + */ + public static String getParentPath(String dataFilePath) { + int endIndex = dataFilePath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR); Review comment: the input is a file path like `/user/warehouse/carbon.store/www/file.carbondata`, so this function will return `/user/warehouse/carbon.store/www/` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364532478 ########## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ########## @@ -714,4 +714,16 @@ public static String generateBadRecordsPath(String badLogStoreLocation, String s CarbonCommonConstants.FILE_SEPARATOR + taskNo; } } + + /** + * Return the parent path of the input file + */ + public static String getParentPath(String dataFilePath) { + int endIndex = dataFilePath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR); Review comment: the input is a file path like `/user/warehouse/carbon.store/www/file.carbondata`, so this function will return `/user/warehouse/carbon.store/www` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
Free forum by Nabble | Edit this page |