[GitHub] [carbondata] dhatchayani commented on a change in pull request #3148: [CARBONDATA-3293] Prune datamaps improvement for count(*)


GitBox
dhatchayani commented on a change in pull request #3148: [CARBONDATA-3293] Prune datamaps improvement for count(*)
URL: https://github.com/apache/carbondata/pull/3148#discussion_r266757605
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
 ##########
 @@ -648,6 +652,51 @@ protected int getTotalBlocklets() {
     return sum;
   }
 
+  @Override public Map<String, Long> getRowCount(Segment segment, List<PartitionSpec> partitions,
+      Map<String, Long> blockletToRowCountMap) throws IOException {
+    if (taskSummaryDMStore.getTotalRowCount() == 0) {
+      return getRowCountForEachBlock(segment, partitions, blockletToRowCountMap);
+    } else {
+      if (taskSummaryDMStore.getRowCount() == 0) {
+        return new HashMap<>();
+      }
+      Long rowCount = blockletToRowCountMap.get("RowCount");
+      long totalRowCount = taskSummaryDMStore.getTotalRowCount();
+      if (null == rowCount) {
+        blockletToRowCountMap.put("RowCount", totalRowCount);
+      } else {
+        blockletToRowCountMap.put("RowCount", totalRowCount + rowCount);
 
 Review comment:
   The HashMap maps each block path to its row count.
   There are two cases:
   (1) IUD table: we need the actual block path here so that it can be compared against the UpdateStatusManager later.
   (2) Normal table (without IUD): no later comparison is needed, so we put a single entry into the map under the dummy key "RowCount" and keep updating its value.
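
The two cases can be illustrated with a minimal sketch. This is not the CarbonData implementation: the class `RowCountSketch`, its methods `addTotal` and `addBlock`, and the block paths used in the usage note are hypothetical names chosen for illustration. Only the dummy key "RowCount" comes from the diff above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not CarbonData code: it only illustrates the two map layouts.
class RowCountSketch {
  // Dummy key used for tables without IUD, as in the diff above.
  static final String ROW_COUNT_KEY = "RowCount";

  // Case (2), normal table: keep one running total under the dummy key.
  // merge() inserts the value if the key is absent, otherwise adds it to the existing one,
  // which matches the null check followed by put() in the diff.
  static void addTotal(Map<String, Long> blockToRowCount, long totalRowCount) {
    blockToRowCount.merge(ROW_COUNT_KEY, totalRowCount, Long::sum);
  }

  // Case (1), IUD table: keep one entry per block path so that a later step
  // can look up each block individually (assumed here to be the update-status check).
  static void addBlock(Map<String, Long> blockToRowCount, String blockPath, long rowCount) {
    blockToRowCount.merge(blockPath, rowCount, Long::sum);
  }
}
```

Under these assumptions, calling `addTotal` twice on an empty map with 100 and 50 leaves a single entry, "RowCount" mapped to 150, while calling `addBlock` with two distinct (hypothetical) block paths leaves two entries, one per path.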

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services