shivamasn commented on a change in pull request #3369: [CARBONDATA-3508] Support CG datamap pruning fallback while querying
URL:
https://github.com/apache/carbondata/pull/3369#discussion_r319359817
##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
##########
@@ -573,19 +573,27 @@ private int getBlockCount(List<ExtendedBlocklet> blocklets) {
if (cgDataMapExprWrapper != null) {
// Prune segments from already pruned blocklets
DataMapUtil.pruneSegments(segmentIds, prunedBlocklets);
- List<ExtendedBlocklet> cgPrunedBlocklets;
+ List<ExtendedBlocklet> cgPrunedBlocklets = new ArrayList<>();
// Again prune with CG datamap.
if (distributedCG && dataMapJob != null) {
- cgPrunedBlocklets = DataMapUtil
- .executeDataMapJob(carbonTable, filter.getResolver(), dataMapJob, partitionsToPrune,
- segmentIds, invalidSegments, DataMapLevel.CG, true, new ArrayList<String>());
+ try {
+ cgPrunedBlocklets = DataMapUtil
+ .executeDataMapJob(carbonTable, filter.getResolver(), dataMapJob, partitionsToPrune,
+ segmentIds, invalidSegments, DataMapLevel.CG, true, new ArrayList<String>());
+ } catch (Exception e) {
+ LOG.error("CG datamap pruning failed.", e);
+ }
} else {
cgPrunedBlocklets = cgDataMapExprWrapper.prune(segmentIds, partitionsToPrune);
}
- // since index datamap prune in segment scope,
- // the result need to intersect with previous pruned result
- prunedBlocklets =
- intersectFilteredBlocklets(carbonTable, prunedBlocklets, cgPrunedBlocklets);
+ // If cgPrunedBlocklets == 0, it means that CG datamap pruning failed,
Review comment:
@kevinjmh Case 1: If there is no data for the filter column, the pruned blocklets from the default datamap are passed through as-is and CG pruning is not attempted.
Case 2: If there is data for the filter column, the number of CG-pruned blocklets should not be 0. If it is 0, something went wrong during CG pruning, so we do not fail the select query and instead fall back to default pruning.
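To make the two cases concrete, here is a minimal, self-contained sketch of the fallback idea. It is not the code in this PR: `CgPruneFallbackSketch`, `pruneWithFallback`, and the use of plain `String` ids in place of `ExtendedBlocklet` are hypothetical stand-ins, and `retainAll` only approximates what `intersectFilteredBlocklets` does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical illustration only; none of these names exist in CarbonData.
class CgPruneFallbackSketch {

  // defaultPruned: blocklet ids kept by the default datamap.
  // cgPruner: stand-in for the CG pruning step, which may throw.
  static List<String> pruneWithFallback(List<String> defaultPruned,
      Callable<List<String>> cgPruner) {
    List<String> cgPruned = new ArrayList<>();
    try {
      cgPruned = cgPruner.call();
    } catch (Exception e) {
      // Log and continue, so the query does not fail because of CG pruning.
      System.err.println("CG datamap pruning failed: " + e.getMessage());
    }
    // Assumption taken from the comment above: an empty CG result is treated
    // as a pruning failure, so the default pruning result is used unchanged.
    if (cgPruned == null || cgPruned.isEmpty()) {
      return defaultPruned;
    }
    // Otherwise keep only the blocklets present in both results.
    List<String> intersected = new ArrayList<>(defaultPruned);
    intersected.retainAll(cgPruned);
    return intersected;
  }
}
```

With `defaultPruned = [b1, b2, b3]`, a CG pruner that returns `[b2, b3, b9]` narrows the result to `[b2, b3]`, while a pruner that throws or returns an empty list leaves `[b1, b2, b3]` untouched.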