[GitHub] [carbondata] kunal642 commented on a change in pull request #3474: [CARBONDATA-3592] Fix query on bloom in case of multiple data files in one segment

GitBox
kunal642 commented on a change in pull request #3474: [CARBONDATA-3592] Fix query on bloom in case of multiple data files in one segment
URL: https://github.com/apache/carbondata/pull/3474#discussion_r365541908
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
 ##########
 @@ -145,18 +147,27 @@ public static DataMapJob getEmbeddedJob() {
    * Prune the segments from the already pruned blocklets.
    */
   public static void pruneSegments(List<Segment> segments, List<ExtendedBlocklet> prunedBlocklets) {
-    Set<Segment> validSegments = new HashSet<>();
+    Map<Segment, Set<String>> validSegments = new HashMap<>();
     for (ExtendedBlocklet blocklet : prunedBlocklets) {
-      // Clear the old pruned index files if any present
-      blocklet.getSegment().getFilteredIndexShardNames().clear();
       // Set the pruned index file to the segment
       // for further pruning.
       String shardName = CarbonTablePath.getShardName(blocklet.getFilePath());
-      blocklet.getSegment().setFilteredIndexShardName(shardName);
-      validSegments.add(blocklet.getSegment());
+      // Add the existing shards to corresponding segments
+      Set<String> existingShards = validSegments.get(blocklet.getSegment());
+      if (existingShards == null) {
+        existingShards = new HashSet<>();
+        validSegments.put(blocklet.getSegment(), existingShards);
+      } else {
+        existingShards.add(shardName);
 
 Review comment:
   fixed
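
For readers following the hunk: in the lines quoted above, shardName is added to existingShards only in the else branch, so the first shard seen for a segment would not be recorded unless the truncated part of the hunk handles it. The following is only a minimal, self-contained sketch of the grouping idea (collect every shard name per segment), not the code that was merged. The ShardGrouping class, its Blocklet type, and the use of plain String segment ids are hypothetical stand-ins for the CarbonData Segment and ExtendedBlocklet classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ShardGrouping {

  // Hypothetical stand-in for ExtendedBlocklet: only the two fields the
  // grouping needs, a segment id and the shard name of its index file.
  static final class Blocklet {
    final String segmentId;
    final String shardName;

    Blocklet(String segmentId, String shardName) {
      this.segmentId = segmentId;
      this.shardName = shardName;
    }
  }

  // Groups shard names by segment. computeIfAbsent creates the set on first
  // sight of a segment and returns it, so the shard is added in both the
  // "new segment" and the "existing segment" case.
  static Map<String, Set<String>> groupShardsBySegment(List<Blocklet> prunedBlocklets) {
    Map<String, Set<String>> validSegments = new HashMap<>();
    for (Blocklet blocklet : prunedBlocklets) {
      validSegments
          .computeIfAbsent(blocklet.segmentId, k -> new HashSet<>())
          .add(blocklet.shardName);
    }
    return validSegments;
  }

  public static void main(String[] args) {
    List<Blocklet> blocklets = new ArrayList<>();
    // Two data files (shards) in segment "0", one in segment "1".
    blocklets.add(new Blocklet("0", "shard_a"));
    blocklets.add(new Blocklet("0", "shard_b"));
    blocklets.add(new Blocklet("1", "shard_c"));

    System.out.println(groupShardsBySegment(blocklets));
  }
}
```

With this shape, a segment that contains several data files keeps all of its shard names, which is the multi-file case the PR title describes.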
