[GitHub] [carbondata] jackylk opened a new pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

classic Classic list List threaded Threaded
34 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk opened a new pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
jackylk opened a new pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561
 
 
    ### Why is this PR needed?
    There are 2 issues:
   1. INSERT STAGE will skip bytes to read the last long in the carbondata file as the offset to the footer, but sometimes DFS is not skipping the right byte size
   2. When error occurs, segment status is not in SUCCESS state, but recovery handling is not doing correctly in  INSERT STAGE command
   
    ### What changes were proposed in this PR?
   1. In CarbonFooterReaderV3.readFooterVersion3, added a while loop to skip required bytes
   2. In CarbonInsertFromStageCommand, when recovery is required, delete the segment entry if the entry is not in SUCESS state.
       
    ### Does this PR introduce any user interface change?
   No
   
   
    ### Is any new testcase added?
   No
   
   
       
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570779633
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1459/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570792363
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1461/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570795547
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1462/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570878021
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1470/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
jackylk commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570883085
 
 
   retest this please

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570890128
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1471/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
jackylk commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570899691
 
 
   retest this please

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-570906477
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1479/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] niuge01 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
niuge01 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571403849
 
 
   LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r363613653
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonFooterReaderV3.java
 ##########
 @@ -59,17 +55,20 @@ public FileFooter3 readFooterVersion3() throws IOException {
     thriftReader.open();
 
     // If footer offset is 0, means caller does not set it,
-    // so we read it from the end of the file
+    // so we read it from the index file
     if (footerOffset == 0) {
 
 Review comment:
   The function of this method should be single, it should only support reading footer.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
QiangCai commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r363614028
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonFooterReaderV3.java
 ##########
 @@ -59,17 +55,20 @@ public FileFooter3 readFooterVersion3() throws IOException {
     thriftReader.open();
 
     // If footer offset is 0, means caller does not set it,
-    // so we read it from the end of the file
+    // so we read it from the index file
     if (footerOffset == 0) {
 
 Review comment:
   if it needs to read the index, better to implement in another place where it decides to read index or footer.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571916593
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1526/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571917785
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1527/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#issuecomment-571957587
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1530/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] qiuchenjian commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
qiuchenjian commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364137100
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
 ##########
 @@ -714,4 +714,16 @@ public static String generateBadRecordsPath(String badLogStoreLocation, String s
           CarbonCommonConstants.FILE_SEPARATOR + taskNo;
     }
   }
+
+  /**
+   * Return the parent path of the input file
+   */
+  public static String getParentPath(String dataFilePath) {
+    int endIndex = dataFilePath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR);
 
 Review comment:
   CarbonCommonConstants.FILE_SEPARATOR is the last character of dataFilePath, such as '/user/warehouse/carbon.store/www/'

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
kunal642 commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364329697
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonIndexFileReader.java
 ##########
 @@ -112,4 +112,34 @@ public void openThriftReader(byte[] fileData) throws IOException {
   public boolean hasNext() throws IOException {
     return thriftReader.hasNext();
   }
+
+  /**
+   * Return the footer offset of the given dataFileName in the index file
+   *
+   * @param indexFilePath path of the index file
+   * @param dataFileName file name of the data file to match in the index file
+   * @param conf hadoop configuration
+   * @return footer offset of the data file
+   * @throws IOException
+   */
+  public static long readFooterOffsetFromIndexFile(String indexFilePath, String dataFileName,
 
 Review comment:
   This method is unused..Please remove if not needed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364532329
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/reader/CarbonIndexFileReader.java
 ##########
 @@ -112,4 +112,34 @@ public void openThriftReader(byte[] fileData) throws IOException {
   public boolean hasNext() throws IOException {
     return thriftReader.hasNext();
   }
+
+  /**
+   * Return the footer offset of the given dataFileName in the index file
+   *
+   * @param indexFilePath path of the index file
+   * @param dataFileName file name of the data file to match in the index file
+   * @param conf hadoop configuration
+   * @return footer offset of the data file
+   * @throws IOException
+   */
+  public static long readFooterOffsetFromIndexFile(String indexFilePath, String dataFileName,
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364532478
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
 ##########
 @@ -714,4 +714,16 @@ public static String generateBadRecordsPath(String badLogStoreLocation, String s
           CarbonCommonConstants.FILE_SEPARATOR + taskNo;
     }
   }
+
+  /**
+   * Return the parent path of the input file
+   */
+  public static String getParentPath(String dataFilePath) {
+    int endIndex = dataFilePath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR);
 
 Review comment:
   the input is a file path like `/user/warehouse/carbon.store/www/file.carbondata`, so this function will return `/user/warehouse/carbon.store/www/`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3561: [HOTFIX] Fix INSERT STAGE footer read error
URL: https://github.com/apache/carbondata/pull/3561#discussion_r364532478
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
 ##########
 @@ -714,4 +714,16 @@ public static String generateBadRecordsPath(String badLogStoreLocation, String s
           CarbonCommonConstants.FILE_SEPARATOR + taskNo;
     }
   }
+
+  /**
+   * Return the parent path of the input file
+   */
+  public static String getParentPath(String dataFilePath) {
+    int endIndex = dataFilePath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR);
 
 Review comment:
   the input is a file path like `/user/warehouse/carbon.store/www/file.carbondata`, so this function will return `/user/warehouse/carbon.store/www`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
12