[GitHub] [carbondata] Indhumathi27 opened a new pull request #3584: [WIP] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

classic Classic list List threaded Threaded
209 messages Options
1 ... 3456789 ... 11
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-591336113
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/495/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-591372425
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2195/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
Indhumathi27 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-593217686
 
 
   retest this please

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-593221565
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/552/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-593242553
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2253/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
Indhumathi27 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-594420696
 
 
   @akashrn5 please review

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-595044228
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2332/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-595045397
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/626/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-595102579
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/628/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-595103103
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2334/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-595103783
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/632/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-595104273
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2339/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-596333216
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/684/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-596333435
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2390/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-596356248
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2391/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-596357267
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/685/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-596455074
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2396/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#issuecomment-596455376
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/690/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r389603171
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
 ##########
 @@ -135,7 +135,7 @@ public CarbonTable getTable() {
     int datamapsCount = 0;
     // In case if filter has matched partitions, then update the segments with datamap's
     // segment list, as getDataMaps will return segments that matches the partition.
-    if (null != partitions && !partitions.isEmpty()) {
+    if (null != partitions && !partitions.isEmpty() || (null != filter && !filter.isEmpty())) {
 
 Review comment:
   Suppose to be && instead of || ?
   
   line 123:
   boolean usePartitionInfoForDataMapPruning = table.isHivePartitionTable() && filter != null && !filter.isEmpty() && partitions != null
   
   and use the same flag to reset in line 138, it is confusing now

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r389613520
 
 

 ##########
 File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ##########
 @@ -479,7 +488,17 @@ object CarbonDataRDDFactory {
             segmentDetails.add(new Segment(resultOfBlock._2._1.getLoadName))
           }
         }
-        val segmentFiles = updateSegmentFiles(carbonTable, segmentDetails, updateModel.get)
+        var segmentMinMaxMap: Map[String, List[SegmentMinMax]] = Map()
+        if (!segmentMinMaxAccumulator.isZero) {
+          segmentMinMaxAccumulator.value.asScala.foreach(map => if (map.nonEmpty) {
+            segmentMinMaxMap = segmentMinMaxMap ++ map
 
 Review comment:
   keep a mutable map, no need to create new map each time

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
1 ... 3456789 ... 11