CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-602183438
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/829/
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [hidden email]
With regards,
Apache Git Services
Indhumathi27 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-602184282
retest this please
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-602200827
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/830/
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-602204009
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2537/
ajantha-bhat commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r396191624
##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
##########
@@ -140,9 +152,28 @@ public DataMapBuilder createBuilder(Segment segment, String shardName,
       segmentMap.put(segment.getSegmentNo(), segment);
       Set<TableBlockIndexUniqueIdentifier> identifiers = getTableBlockIndexUniqueIdentifiers(segment);
-      // get tableBlockIndexUniqueIdentifierWrappers from segment file info
-      getTableBlockUniqueIdentifierWrappers(partitionsToPrune,
-          tableBlockIndexUniqueIdentifierWrappers, identifiers);
+      if (null != partitionsToPrune) {
+        // get tableBlockIndexUniqueIdentifierWrappers from segment file info
+        getTableBlockUniqueIdentifierWrappers(partitionsToPrune,
+            tableBlockIndexUniqueIdentifierWrappers, identifiers);
+      } else {
+        SegmentMetaDataInfo segmentMetaDataInfo = segment.getSegmentMetaDataInfo();

Review comment:
tableBlockIndexUniqueIdentifierWrappers is just a wrapper around TableBlockIndexUniqueIdentifier. So we should avoid creating the TableBlockIndexUniqueIdentifier itself if the filter doesn't match the segment min/max. [Line 153 should not be called if isScanRequired is false.]
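The idea in the review comment above (check the filter against segment-level min/max first, and build identifiers only when a scan is required) can be illustrated with a minimal sketch. Everything here is hypothetical: `SegmentStats`, `isScanRequired`, and `identifiersToLoad` are stand-ins invented for illustration and are not CarbonData classes or the PR's actual code. The sketch also assumes a single long-typed column and an equality filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these types are CarbonData classes.
final class SegmentMinMaxSketch {

  /** Illustrative per-segment min/max statistics for one long-typed column. */
  static final class SegmentStats {
    final String segmentNo;
    final long min;
    final long max;

    SegmentStats(String segmentNo, long min, long max) {
      this.segmentNo = segmentNo;
      this.min = min;
      this.max = max;
    }
  }

  /** Returns true when an equality filter on the column could match rows in the segment. */
  static boolean isScanRequired(SegmentStats stats, long filterValue) {
    return filterValue >= stats.min && filterValue <= stats.max;
  }

  /**
   * Builds identifiers only for segments that pass the min/max check,
   * so pruned segments never allocate identifier objects.
   */
  static List<String> identifiersToLoad(List<SegmentStats> segments, long filterValue) {
    List<String> identifiers = new ArrayList<>();
    for (SegmentStats stats : segments) {
      if (!isScanRequired(stats, filterValue)) {
        continue; // pruned before any identifier is created
      }
      // A plain string stands in for the real identifier object.
      identifiers.add("segment_" + stats.segmentNo + ".index");
    }
    return identifiers;
  }

  public static void main(String[] args) {
    List<SegmentStats> segments = Arrays.asList(
        new SegmentStats("0", 1, 100),
        new SegmentStats("1", 101, 200),
        new SegmentStats("2", 150, 300));
    // Only segments whose [min, max] range contains 175 are kept.
    System.out.println(identifiersToLoad(segments, 175));
  }
}
```

The point of ordering the check first is that the per-segment cost of a pruned segment stays at one range comparison, with no identifier or wrapper objects created on the driver.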
ajantha-bhat commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r396223698
##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
##########
@@ -140,9 +152,28 @@ public DataMapBuilder createBuilder(Segment segment, String shardName,
       segmentMap.put(segment.getSegmentNo(), segment);
       Set<TableBlockIndexUniqueIdentifier> identifiers = getTableBlockIndexUniqueIdentifiers(segment);
-      // get tableBlockIndexUniqueIdentifierWrappers from segment file info
-      getTableBlockUniqueIdentifierWrappers(partitionsToPrune,
-          tableBlockIndexUniqueIdentifierWrappers, identifiers);
+      if (null != partitionsToPrune) {
+        // get tableBlockIndexUniqueIdentifierWrappers from segment file info
+        getTableBlockUniqueIdentifierWrappers(partitionsToPrune,
+            tableBlockIndexUniqueIdentifierWrappers, identifiers);
+      } else {
+        SegmentMetaDataInfo segmentMetaDataInfo = segment.getSegmentMetaDataInfo();

Review comment:
If we keep SegmentFileStore inside the Segment class and read the segment before calling readCommittedScope.committedIndexFiles, we can avoid creating the TableBlockIndexUniqueIdentifier object. That may require more changes now; it can be done in another PR.
ajantha-bhat commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-602398097
LGTM
Indhumathi27 commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r396224060
##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
##########
@@ -140,9 +152,28 @@ public DataMapBuilder createBuilder(Segment segment, String shardName,
       segmentMap.put(segment.getSegmentNo(), segment);
       Set<TableBlockIndexUniqueIdentifier> identifiers = getTableBlockIndexUniqueIdentifiers(segment);
-      // get tableBlockIndexUniqueIdentifierWrappers from segment file info
-      getTableBlockUniqueIdentifierWrappers(partitionsToPrune,
-          tableBlockIndexUniqueIdentifierWrappers, identifiers);
+      if (null != partitionsToPrune) {
+        // get tableBlockIndexUniqueIdentifierWrappers from segment file info
+        getTableBlockUniqueIdentifierWrappers(partitionsToPrune,
+            tableBlockIndexUniqueIdentifierWrappers, identifiers);
+      } else {
+        SegmentMetaDataInfo segmentMetaDataInfo = segment.getSegmentMetaDataInfo();

Review comment:
OK. Will take this up in a future PR.
asfgit closed pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584