Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Classic

List

20 messages Options

Options

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

GitHub user kevinjmh opened a pull request:

https://github.com/apache/carbondata/pull/2900

[CARBONDATA-3078] Disable explain collector for count star query without filter

An issue is found about count star query without filter in explain command. It is a special case. It uses different plan.
Considering
no useful information about block/blocklet pruning for count star query without filter, so disable explain collector and avoid the exception in https://issues.apache.org/jira/browse/CARBONDATA-3078

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?

- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.

- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kevinjmh/carbondata explain_countstar

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2900.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2900

----
commit 46f07a25924e98d43dd9dcfad3a1c7cb0cd4d895
Author: Manhua <kevinjmh@...>
Date: 2018-11-05T12:17:59Z

disable explain collector for count star query without filter

----

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1276/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1495/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9545/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user kevinjmh commented on the issue:

https://github.com/apache/carbondata/pull/2900

retest this please

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r230970553

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

I think this modification just try to avoid the problem but don't actually solve the problem.
Can you explain what is the root cause of that problem?

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user kevinjmh commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r230972236

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

You are right.
Normal query flow goes to `CarbonInputFormat#getPrunedBlocklets` and initialize the pruning info for table we queried. But count star query without filter use a different query plan, it does not go into that method, so no pruning info does not init. When it calls default data map to prune(with a null filter), exception will occur during settingg pruning info.

One solution is to init the pruning info for this type of query in mrthod `getBlockRowCount`. But considering
no useful information about block/blocklet pruning for such query(actually no pruning), I choose to disable the expalin collector instead.

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1294/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1507/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9555/

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r231010174

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

That's the weird part -- We are trying to remove the pruning collector even the pruning info is not initialized.
I think you can add a flag for the collector to identify whether it is initialized. And this flag is used where carbon what to record the info. If you are planing to work like this, please add a comment for the scenario of this variable.

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user kevinjmh commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r231010531

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

No remove. Its implementation is disable.

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r231012866

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

oh... I understand.
The current implementation of pruning collector may has bugs. Based on the current implementation, your modification is OK...

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r231012917

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

please add comments for your modification in the code for better understanding

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user kevinjmh commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2900#discussion_r231014871

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
*/
public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
List<PartitionSpec> partitions) throws IOException {
+ // no useful information for count star query without filter, so disable explain collector
+ ExplainCollector.remove();
--- End diff --

OK

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1304/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1519/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2900

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9568/

---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2900

LGTM

---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

In reply to this post by qiuchenjian-2

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2900

---