[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
GitHub user kevinjmh opened a pull request:

    https://github.com/apache/carbondata/pull/2900

    [CARBONDATA-3078] Disable explain collector for count star query without filter

    An issue is found about count star query without filter in explain command. It is a special case. It uses different plan.
    Considering
    no useful information about block/blocklet pruning for count star query without filter, so disable explain collector and avoid the exception in https://issues.apache.org/jira/browse/CARBONDATA-3078
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinjmh/carbondata explain_countstar

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2900.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2900
   
----
commit 46f07a25924e98d43dd9dcfad3a1c7cb0cd4d895
Author: Manhua <kevinjmh@...>
Date:   2018-11-05T12:17:59Z

    disable explain collector for count star query without filter

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1276/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1495/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9545/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r230970553
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    I think this modification just try to avoid the problem but don't actually solve the problem.
    Can you explain what is the root cause of that problem?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r230972236
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    You are right.
    Normal query flow goes to `CarbonInputFormat#getPrunedBlocklets` and initialize the pruning info for table we queried.  But count star query without filter use a different query plan, it does not go into that method, so no pruning info does not init. When it calls default data map to prune(with a null filter), exception will occur during settingg pruning info.
   
    One solution is to init the pruning info for this type of query in mrthod `getBlockRowCount`. But considering
    no useful information about block/blocklet pruning for such query(actually no pruning), I choose to disable the expalin collector instead.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1294/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1507/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9555/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231010174
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    That's the weird part -- We are trying to remove the pruning collector even the pruning info is not initialized.
    I think you can add a flag for the collector to identify whether it is initialized. And this flag is used where carbon what to record the info. If you are planing to work like this, please add a comment for the scenario of this variable.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231010531
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    No remove. Its implementation is disable.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231012866
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    oh... I understand.
    The current implementation of pruning collector may has bugs. Based on the current implementation, your modification is OK...


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231012917
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    please add comments for your modification in the code for better understanding


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231014871
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
   
    OK


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1304/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1519/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9568/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2900
 
    LGTM


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2900


---