akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503036710

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
          throw new Exception("Exception in compaction " + exception.getMessage)
        }
      } finally {
-      executor.shutdownNow()
        try {
-        compactor.deletePartialLoadsInCompaction()

Review comment:
@QiangCai how is it handled now without listing files? Why can't we list the files with a timestamp filter (the load timestamp / fact timestamp), which we can get from the load model or somewhere similar?
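For illustration, here is a minimal sketch of the timestamp-filtered listing akashrn5 suggests, written against the Hadoop FileSystem API rather than CarbonData's own FileFactory/CarbonFile abstraction; the segmentPath and factTimestamp parameters are assumed to come from the load model and are not taken from the PR itself.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    // List only the files written at or after the given load/fact timestamp,
    // instead of scanning and deleting everything found in the segment directory.
    def listFilesNewerThan(segmentPath: String, factTimestamp: Long): Array[Path] = {
      val dir = new Path(segmentPath)
      val fs = dir.getFileSystem(new Configuration())
      fs.listStatus(dir)
        .filter(status => status.isFile && status.getModificationTime >= factTimestamp)
        .map(_.getPath)
    }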
akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503037126

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
          throw new Exception("Exception in compaction " + exception.getMessage)
        }
      } finally {
-      executor.shutdownNow()
        try {
-        compactor.deletePartialLoadsInCompaction()

Review comment:
@Pickupolddriver We cannot remove the stale-file cleanup in the IUD case and wait for the clean files command to handle it; we should clean the stale files immediately in the respective command itself, as otherwise there is a chance of extra data or data inconsistency. @QiangCai we may be able to avoid this once we implement writing the updated data to a new segment and writing only the delete delta files to the updated segment.
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-706873519

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4363/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-706875962

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2613/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-707009914

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2625/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-707031327

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4376/
QiangCai commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503629383

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
          throw new Exception("Exception in compaction " + exception.getMessage)
        }
      } finally {
-      executor.shutdownNow()
        try {
-        compactor.deletePartialLoadsInCompaction()

Review comment:
@akashrn5 After we avoid using listFiles during loading, a stale segment (for example 0.1) will not impact data consistency: even if the stale segment contains stale index and data files, we will not add them to the segment file.
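To make this point concrete, here is a hypothetical sketch (not CarbonData's actual SegmentFileStore API) of why avoiding listFiles protects consistency: the segment file is built only from the index files the current load reports back, so stale files left behind by an aborted load are never referenced; SegmentFileEntry and buildSegmentFile are illustrative names, not code from this PR.

    // Hypothetical illustration only: the writers report the index files they produced,
    // and the segment file records exactly that list. A stale index or data file sitting
    // in the same directory is never discovered, because nothing lists the directory.
    case class SegmentFileEntry(segmentNo: String, indexFiles: Seq[String])

    def buildSegmentFile(segmentNo: String, writtenIndexFiles: Seq[String]): SegmentFileEntry =
      SegmentFileEntry(segmentNo, writtenIndexFiles)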
QiangCai commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503631339

File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala

@@ -108,9 +108,6 @@ private[sql] case class CarbonProjectForDeleteCommand(
      }
      val executorErrors = ExecutionErrors(FailureCauses.NONE, "")
-    // handle the clean up of IUD.
-    CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, false)

Review comment:
If we don't clean up the stale delta files, they will be picked up after the next update. Maybe we can't remove this call for now.
Pickupolddriver commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r504382333

File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala

@@ -108,9 +108,6 @@ private[sql] case class CarbonProjectForDeleteCommand(
      }
      val executorErrors = ExecutionErrors(FailureCauses.NONE, "")
-    // handle the clean up of IUD.
-    CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, false)

Review comment:
Yes, the purpose of this PR is to prevent deleting data by accident. After checking the code of the `cleanUpDeltaFiles` function, I think it's safe to keep it here, since it only deletes files in the aborted scenario and it's crucial for data accuracy. I will revert the removal of this call.
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-708214926

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2672/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-708216427

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4426/
Pickupolddriver commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r504537575

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
          throw new Exception("Exception in compaction " + exception.getMessage)
        }
      } finally {
-      executor.shutdownNow()
        try {
-        compactor.deletePartialLoadsInCompaction()

Review comment:
@ajantha-bhat After the merge of #3978, cleanStaleDeltaFiles will only be called when an exception occurs during the update process, and it will only delete the delta and index files created by the failed update.
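As a rough sketch of the failure-path-only cleanup described here (simplified, not the exact code in this PR; performDelete and LOGGER are placeholders, while CarbonUpdateUtil.cleanStaleDeltaFiles is the method referenced elsewhere in this thread), the stale delta/index files are removed only in the exception branch, scoped by the operation's own timestamp:

    try {
      // hypothetical stand-in for the actual delete/update flow
      performDelete(carbonTable, timestamp)
    } catch {
      case e: Exception =>
        LOGGER.error("Exception in Delete data operation " + e.getMessage, e)
        // remove only the delta and index files written by this failed operation
        CarbonUpdateUtil.cleanStaleDeltaFiles(carbonTable, timestamp)
        throw e
    }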
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-708300620

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4429/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-708341574

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2675/
akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r505226878

File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala

@@ -149,14 +149,10 @@ private[sql] case class CarbonProjectForDeleteCommand(
        case e: HorizontalCompactionException =>
          LOGGER.error("Delete operation passed. Exception in Horizontal Compaction." +
            " Please check logs. " + e.getMessage)
-        CarbonUpdateUtil.cleanStaleDeltaFiles(carbonTable, e.compactionTimeStamp.toString)
          Seq(Row(0L))
        case e: Exception =>
          LOGGER.error("Exception in Delete data operation " + e.getMessage, e)
-        // ****** start clean up.

Review comment:
I don't think we can remove this directly; it might create problems, as mentioned in the earlier comments.

File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala

@@ -374,8 +374,6 @@ object DeleteExecution {
            blockMappingVO.getSegmentNumberOfBlockMapping)
        }
      } else {
-      // In case of failure , clean all related delete delta files
-      CarbonUpdateUtil.cleanStaleDeltaFiles(carbonTable, timestamp)

Review comment:
Same as above.
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-708998997

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4458/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-709021329

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2705/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-709762201

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2717/
CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-709794950

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4471/