Login  Register

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4072: [CARBONDATA-4110] Support clean files dry run operation and show statistics after clean files operation

Posted by GitBox on Feb 17, 2021; 9:38am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-vikramahuja1001-opened-a-new-pull-request-4072-WIP-Clean-files-phase2-tp105322p106250.html


ajantha-bhat commented on a change in pull request #4072:
URL: https://github.com/apache/carbondata/pull/4072#discussion_r577458113



##########
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
##########
@@ -1297,4 +1359,37 @@ public static TableStatusReturnTuple separateVisibleAndInvisibleSegments(
       return new HashMap<>(0);
     }
   }
+
+  public static long partitionTableSegmentSize(CarbonTable carbonTable, LoadMetadataDetails

Review comment:
       I am thinking now all the clean file operations will become slow because of these size calculation code, which need to interact with the file system.
   
   Default we can have this size calculation. but if user wants clean files to be faster. Can we have some option as `summary = false`, which won't do any new size calculation operation and clean the files faster ?? @akashrn5 , @QiangCai what you think ?
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]