[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4072: [CARBONDATA-4110] Support clean files dry run operation and show statistics after clean files operation
Posted by GitBox on Feb 17, 2021; 9:35am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-vikramahuja1001-opened-a-new-pull-request-4072-WIP-Clean-files-phase2-tp105322p106249.html
ajantha-bhat commented on a change in pull request #4072:
URL: https://github.com/apache/carbondata/pull/4072#discussion_r577458113
##########
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
##########
@@ -1297,4 +1359,37 @@ public static TableStatusReturnTuple separateVisibleAndInvisibleSegments(
return new HashMap<>(0);
}
}
+
+ public static long partitionTableSegmentSize(CarbonTable carbonTable, LoadMetadataDetails
Review comment:
I think all clean files operations will now become slower because of this size calculation code, which needs to interact with the file system.
So, can we add an option such as `summary = false`, which skips the new size calculation and cleans the files faster? @akashrn5, @QiangCai, what do you think?
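To illustrate the idea, here is a rough sketch of gating the size calculation behind such a flag. It is only an assumption about how this could look, not code from this PR: `CleanFilesSketch`, `cleanFiles`, `Result`, and the `sizeOf` callback are hypothetical names, and the real deletion logic is left out.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.function.ToLongFunction;

// Hypothetical sketch: none of these names exist in CarbonData.
public class CleanFilesSketch {

    /** Outcome of a clean run; freedBytes is empty when summary is disabled. */
    public static final class Result {
        public final int segmentsCleaned;
        public final OptionalLong freedBytes;

        Result(int segmentsCleaned, OptionalLong freedBytes) {
            this.segmentsCleaned = segmentsCleaned;
            this.freedBytes = freedBytes;
        }
    }

    /**
     * Cleans the given stale segments. The sizeOf callback stands in for the
     * expensive file system call; it is invoked only when summary is true.
     */
    public static Result cleanFiles(List<String> staleSegments,
                                    boolean summary,
                                    ToLongFunction<String> sizeOf) {
        long freed = 0L;
        for (String segment : staleSegments) {
            if (summary) {
                // Only pay the file system cost when statistics are requested.
                freed += sizeOf.applyAsLong(segment);
            }
            // The actual deletion of the segment would happen here.
        }
        OptionalLong freedBytes =
            summary ? OptionalLong.of(freed) : OptionalLong.empty();
        return new Result(staleSegments.size(), freedBytes);
    }
}
```

With `summary = false` the callback is never invoked, so the clean path does no extra file system work; with `summary = true` the caller gets the total freed size for the statistics output.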
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[hidden email]