[GitHub] [carbondata] akashrn5 commented on a change in pull request #4013: [CARBONDATA-4062] Make clean files as data trash manager

Posted by GitBox on Dec 03, 2020; 3:36pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-QiangCai-opened-a-new-pull-request-4013-WIP-Remove-automatically-clean-data-tp103289p104206.html


akashrn5 commented on a change in pull request #4013:
URL: https://github.com/apache/carbondata/pull/4013#discussion_r535339696



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
##########
@@ -90,7 +90,6 @@ class CarbonTableCompactor(carbonLoadModel: CarbonLoadModel,
 
     while (loadsToMerge.size() > 1 || needSortSingleSegment(loadsToMerge)) {
       val lastSegment = sortedSegments.get(sortedSegments.size() - 1)
-      deletePartialLoadsInCompaction()

Review comment:
       But the stale data will always remain inside the segment folder, right? Suppose the segment file is corrupted or deleted: Carbon should still serve the query, which it does by listing the segment folder. In that case we will either get wrong data or the query will always fail. I think we need some way to clean these files up. What do you think, @QiangCai @ajantha-bhat?
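
       To make the concern concrete, here is a minimal sketch of the scenario. It is a simplified, hypothetical model and not CarbonData's actual classes: `SegmentFolder`, `filesToRead`, and the file names are all assumptions for illustration. It only shows how a fallback that lists the folder would pick up stale files left behind by a failed compaction.

```scala
// Hypothetical model for illustration only; none of these names exist in CarbonData.
object StaleDataSketch {
  // Files physically present in a segment folder, including any stale
  // leftovers from a compaction that failed part-way.
  final case class SegmentFolder(files: Set[String])

  // Decide which files a query reads.
  // `segmentFile` stands in for the segment file's list of valid data files;
  // None models a corrupted or deleted segment file.
  def filesToRead(folder: SegmentFolder, segmentFile: Option[Set[String]]): Set[String] =
    segmentFile match {
      // Normal path: only files recorded in the segment file are read.
      case Some(valid) => folder.files.intersect(valid)
      // Fallback path: everything found by listing the folder is read,
      // so stale files are included as well.
      case None => folder.files
    }

  def main(args: Array[String]): Unit = {
    val folder = SegmentFolder(Set("part-0-valid.carbondata", "part-0-stale.carbondata"))
    val segmentFile = Some(Set("part-0-valid.carbondata"))

    println(filesToRead(folder, segmentFile)) // segment file present
    println(filesToRead(folder, None))        // segment file missing
  }
}
```

       With the segment file present, only the valid file is selected. Without it, the listing fallback also returns the stale file, which is why some cleanup of partial loads still seems necessary.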




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]