GitHub user kevinjmh opened a pull request:
https://github.com/apache/carbondata/pull/2900 [CARBONDATA-3078] Disable explain collector for count star query without filter An issue is found about count star query without filter in explain command. It is a special case. It uses different plan. Considering no useful information about block/blocklet pruning for count star query without filter, so disable explain collector and avoid the exception in https://issues.apache.org/jira/browse/CARBONDATA-3078 Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kevinjmh/carbondata explain_countstar Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2900.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2900 ---- commit 46f07a25924e98d43dd9dcfad3a1c7cb0cd4d895 Author: Manhua <kevinjmh@...> Date: 2018-11-05T12:17:59Z disable explain collector for count star query without filter ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1276/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1495/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9545/ --- |
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:
https://github.com/apache/carbondata/pull/2900 retest this please --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r230970553 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- I think this modification just try to avoid the problem but don't actually solve the problem. Can you explain what is the root cause of that problem? --- |
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r230972236 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- You are right. Normal query flow goes to `CarbonInputFormat#getPrunedBlocklets` and initialize the pruning info for table we queried. But count star query without filter use a different query plan, it does not go into that method, so no pruning info does not init. When it calls default data map to prune(with a null filter), exception will occur during settingg pruning info. One solution is to init the pruning info for this type of query in mrthod `getBlockRowCount`. But considering no useful information about block/blocklet pruning for such query(actually no pruning), I choose to disable the expalin collector instead. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1294/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1507/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9555/ --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r231010174 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- That's the weird part -- We are trying to remove the pruning collector even the pruning info is not initialized. I think you can add a flag for the collector to identify whether it is initialized. And this flag is used where carbon what to record the info. If you are planing to work like this, please add a comment for the scenario of this variable. --- |
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r231010531 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- No remove. Its implementation is disable. --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r231012866 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- oh... I understand. The current implementation of pruning collector may has bugs. Based on the current implementation, your modification is OK... --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r231012917 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- please add comments for your modification in the code for better understanding --- |
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2900#discussion_r231014871 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter, */ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table, List<PartitionSpec> partitions) throws IOException { + // no useful information for count star query without filter, so disable explain collector + ExplainCollector.remove(); --- End diff -- OK --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1304/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1519/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2900 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9568/ --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2900 LGTM --- |
In reply to this post by qiuchenjian-2
|
Free forum by Nabble | Edit this page |