Indhumathi27 opened a new pull request #3835: URL: https://github.com/apache/carbondata/pull/3835 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-656860644 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3354/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-656861252 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1613/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657220725 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1615/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657220952 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3356/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657241951 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3357/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657242838 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1616/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akashrn5 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657399639 @Indhumathi27 1. I think, this is not specific to cdc, i think problem can come in case of normal partition table insert into flow also when enableDirectlyWriteDataToStorePath is true for writer. Please check that. 2. In case of cdc , we set direct write to hdfs as false before doing insert, please check who made it enabled. if first point is true, you can change the PR description and remove cdc from there. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Indhumathi27 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657410697 @akashrn5 Fixed and updated the PR. PLease review and merge ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#discussion_r453499751 ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -318,6 +335,13 @@ public void initializeWriter() throws CarbonDataWriterException { try { FileFactory.mkdirs(model.getCarbonDataDirectoryPath()); if (enableDirectlyWriteDataToStorePath) { + if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp")) { + // set carbonData file store path to partition path instead of tmp directory + carbonDataFileStorePath = model.getCarbonDataDirectoryPath() + .substring(0, model.getCarbonDataDirectoryPath().lastIndexOf("/")) + File.separator Review comment: use `CarbonCommonConstans.FILE_SEPARATOR` ########## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/CarbonMergeDataSetCommand.scala ########## @@ -283,6 +284,13 @@ case class CarbonMergeDataSetCommand( val queue = new util.LinkedList[InternalRow]() override def hasNext: Boolean = if (!queue.isEmpty || iter.hasNext) true else { writer.close() + // revert load direct write to store path after IUD Review comment: update the command as its not IUD exactly, its just an insert operation ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -281,12 +281,29 @@ protected void commitCurrentFile(boolean copyInCurrentThread) { LOGGER.error(e); } } else { - if (currentFileSize == 0) { - try { + try { + if (currentFileSize == 0) { handleEmptyDataFile(carbonDataFileStorePath); - } catch (IOException e) { - LOGGER.error(e); + } else { + // In case if direct write data to store path is enabled, carbonData files will be written + // directly to partition path. Need to add partition path and carbon data file size info + // to load metrics + if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp") && carbonDataFileStorePath Review comment: check for ends with ` .tmp` similar to carbonUtil APIs ########## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/CarbonMergeDataSetCommand.scala ########## @@ -283,6 +284,13 @@ case class CarbonMergeDataSetCommand( val queue = new util.LinkedList[InternalRow]() override def hasNext: Boolean = if (!queue.isEmpty || iter.hasNext) true else { writer.close() + // revert load direct write to store path after IUD + if (null == directlyWriteDataToHdfs) { Review comment: no need of null check, use .getProperty(property_value, default value) , so u can check null check ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -440,9 +464,23 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException { .copyCarbonDataFileToCarbonStorePath(indexFileName, model.getCarbonDataDirectoryPath(), fileSizeInBytes, metrics); FileFactory.deleteFile(indexFileName); + } else if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp")) { Review comment: same as above ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -318,6 +335,13 @@ public void initializeWriter() throws CarbonDataWriterException { try { FileFactory.mkdirs(model.getCarbonDataDirectoryPath()); if (enableDirectlyWriteDataToStorePath) { + if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp")) { Review comment: same as above,` .tmp` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akashrn5 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657441200 @Indhumathi27 please update the PR description, no need to cdc, can say partition issue when direct write to hdfs enabled, and in detailed description part can list issues. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Indhumathi27 commented on a change in pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#discussion_r453544294 ########## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/CarbonMergeDataSetCommand.scala ########## @@ -283,6 +284,13 @@ case class CarbonMergeDataSetCommand( val queue = new util.LinkedList[InternalRow]() override def hasNext: Boolean = if (!queue.isEmpty || iter.hasNext) true else { writer.close() + // revert load direct write to store path after IUD Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Indhumathi27 commented on a change in pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#discussion_r453544523 ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -318,6 +335,13 @@ public void initializeWriter() throws CarbonDataWriterException { try { FileFactory.mkdirs(model.getCarbonDataDirectoryPath()); if (enableDirectlyWriteDataToStorePath) { + if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp")) { + // set carbonData file store path to partition path instead of tmp directory + carbonDataFileStorePath = model.getCarbonDataDirectoryPath() + .substring(0, model.getCarbonDataDirectoryPath().lastIndexOf("/")) + File.separator Review comment: done ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -318,6 +335,13 @@ public void initializeWriter() throws CarbonDataWriterException { try { FileFactory.mkdirs(model.getCarbonDataDirectoryPath()); if (enableDirectlyWriteDataToStorePath) { + if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp")) { Review comment: done ########## File path: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ########## @@ -440,9 +464,23 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException { .copyCarbonDataFileToCarbonStorePath(indexFileName, model.getCarbonDataDirectoryPath(), fileSizeInBytes, metrics); FileFactory.deleteFile(indexFileName); + } else if (model.getTableSpec().getCarbonTable().isHivePartitionTable() && model + .getCarbonDataDirectoryPath().endsWith("tmp")) { Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657512354 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1630/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Indhumathi27 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657531694 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657549525 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3372/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657605258 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3377/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657606800 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1636/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Indhumathi27 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657607506 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3835: URL: https://github.com/apache/carbondata/pull/3835#issuecomment-657681240 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3380/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |