Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2936 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1529/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2936 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9787/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2936 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1739/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2936 retest this please ---
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2936 LGTM ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2936 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1543/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2936 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1754/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2936 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9802/ ---
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r236564769

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---
@@ -63,6 +75,8 @@
   private SegmentPropertiesFetcher segmentPropertiesFetcher;

+  private static final Log LOG = LogFactory.getLog(TableDataMap.class);
--- End diff --

We do not use apache-commons-logging in the carbondata project! Please take care of this.
---
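For context, carbondata classes usually obtain their logger through the project's own logging facade rather than commons-logging's Log/LogFactory. A minimal sketch, assuming the LogService/LogServiceFactory facade from that release line (the class name here is hypothetical, not the PR's code):

    import org.apache.carbondata.common.logging.LogService;
    import org.apache.carbondata.common.logging.LogServiceFactory;

    public final class TableDataMapLoggingSketch {
      // carbondata's own logging facade instead of commons-logging Log/LogFactory
      private static final LogService LOGGER =
          LogServiceFactory.getLogService(TableDataMapLoggingSketch.class.getName());

      public static void main(String[] args) {
        LOGGER.info("Started block pruning ...");
      }
    }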
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r236565153

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1399,6 +1399,17 @@ private CarbonCommonConstants() {
   public static final String CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT = "false";
+
+  /**
+   * max driver threads used for block pruning [1 to 4 threads]
+   */
+  @CarbonProperty public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING =
+      "carbon.max.driver.threads.for.block.pruning";
--- End diff --

I think it's better to use the name `carbon.query.pruning.parallelism.driver`.
---
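As an illustration only (the final property name was still under discussion in this thread), a carbon property like this is normally set and read through CarbonProperties. The sketch below assumes the constant names proposed in the diff above:

    import org.apache.carbondata.core.constants.CarbonCommonConstants;
    import org.apache.carbondata.core.util.CarbonProperties;

    public class PruningParallelismExample {
      public static void main(String[] args) {
        // set the driver-side pruning parallelism (illustrative value)
        CarbonProperties.getInstance().addProperty(
            CarbonCommonConstants.CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING, "2");

        // read it back, falling back to the documented default of "4"
        String threads = CarbonProperties.getInstance().getProperty(
            CarbonCommonConstants.CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING,
            CarbonCommonConstants.CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING_DEFAULT);
        System.out.println("Driver pruning threads: " + threads);
      }
    }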
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r236568719

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java ---
@@ -487,6 +487,8 @@ private int getBlockCount(List<ExtendedBlocklet> blocklets) {
     // First prune using default datamap on driver side.
     TableDataMap defaultDataMap = DataMapStoreManager.getInstance().getDefaultDataMap(carbonTable);
     List<ExtendedBlocklet> prunedBlocklets = null;
+    // This is to log the event, so user will know what is happening by seeing logs.
+    LOG.info("Started block pruning ...");
--- End diff --

Instead of adding these logs, I think we'd better add the time consumed for pruning in statistics.
---
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r236565449

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1399,6 +1399,17 @@ private CarbonCommonConstants() {
   public static final String CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT = "false";
+
+  /**
+   * max driver threads used for block pruning [1 to 4 threads]
+   */
+  @CarbonProperty public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING =
+      "carbon.max.driver.threads.for.block.pruning";
+
+  public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING_DEFAULT = "4";
+
+  // block prune in multi-thread if files size more than 100K files.
+  public static final int CARBON_DRIVER_PRUNING_MULTI_THREAD_ENABLE_FILES_COUNT = 100000;
--- End diff --

Why add this constraint?
---
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r237173126

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java ---
@@ -487,6 +487,8 @@ private int getBlockCount(List<ExtendedBlocklet> blocklets) {
     // First prune using default datamap on driver side.
     TableDataMap defaultDataMap = DataMapStoreManager.getInstance().getDefaultDataMap(carbonTable);
     List<ExtendedBlocklet> prunedBlocklets = null;
+    // This is to log the event, so user will know what is happening by seeing logs.
+    LOG.info("Started block pruning ...");
--- End diff --

The log will anyway have a timestamp, so we can subtract the stop and start times. I have another non-default datamap PR; I will check about this there.
---
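A minimal sketch of the stop-minus-start measurement being discussed; the pruning call here is a hypothetical stand-in so the example compiles, not the actual driver-side code from the PR:

    import java.util.Collections;
    import java.util.List;

    public class PruningTimingSketch {
      public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        List<String> prunedBlocklets = pruneWithDefaultDataMap();  // stand-in for the real pruning call
        long elapsedMillis = System.currentTimeMillis() - startTime;
        // the elapsed time (stop - start) is what would be recorded in the query statistics
        System.out.println("Block pruning took " + elapsedMillis
            + " ms, pruned blocklets: " + prunedBlocklets.size());
      }

      // hypothetical placeholder; the PR prunes with the default datamap on the driver
      private static List<String> pruneWithDefaultDataMap() {
        return Collections.emptyList();
      }
    }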
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r237173725

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1399,6 +1399,17 @@ private CarbonCommonConstants() {
   public static final String CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT = "false";
+
+  /**
+   * max driver threads used for block pruning [1 to 4 threads]
+   */
+  @CarbonProperty public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING =
+      "carbon.max.driver.threads.for.block.pruning";
+
+  public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING_DEFAULT = "4";
+
+  // block prune in multi-thread if files size more than 100K files.
+  public static final int CARBON_DRIVER_PRUNING_MULTI_THREAD_ENABLE_FILES_COUNT = 100000;
--- End diff --

Because multi-threading on the driver by default may impact concurrent queries. Also, testing showed that pruning 100K datamap files takes about 1 second, so multi-threaded pruning is enabled only when block pruning would take more than a second.
---
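A rough sketch of that decision: the numbers mirror the constants in the diff above, but the method itself is hypothetical and only illustrates the "single thread below 100K files, clamped multi-threading above it" reasoning:

    public class PruningThreadCountSketch {
      // values mirror the constants proposed in the diff
      private static final int MULTI_THREAD_ENABLE_FILES_COUNT = 100000;
      private static final int MAX_DRIVER_PRUNING_THREADS = 4;

      // stay single-threaded below the 100K-file threshold so concurrent queries are not impacted;
      // switch to multi-threaded pruning only when pruning would otherwise take more than ~1 second
      static int pruningThreads(int totalFiles, int configuredMax) {
        if (totalFiles < MULTI_THREAD_ENABLE_FILES_COUNT) {
          return 1;
        }
        return Math.min(Math.max(configuredMax, 1), MAX_DRIVER_PRUNING_THREADS);
      }

      public static void main(String[] args) {
        System.out.println(pruningThreads(50_000, 4));   // 1: small table, single-threaded
        System.out.println(pruningThreads(500_000, 4));  // 4: large table, multi-threaded pruning
      }
    }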
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r237173860

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1399,6 +1399,17 @@ private CarbonCommonConstants() {
   public static final String CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT = "false";
+
+  /**
+   * max driver threads used for block pruning [1 to 4 threads]
+   */
+  @CarbonProperty public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING =
+      "carbon.max.driver.threads.for.block.pruning";
--- End diff --

I have another non-default datamap PR; I will check about this there. I feel this name is also OK.
---
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2936#discussion_r237174006

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---
@@ -63,6 +75,8 @@
   private SegmentPropertiesFetcher segmentPropertiesFetcher;

+  private static final Log LOG = LogFactory.getLog(TableDataMap.class);
--- End diff --

OK.
---