GitHub user manishgupta88 opened a pull request:
    https://github.com/apache/carbondata/pull/2540

    [WIP] Handled executor min/max pruning when filter column is not cached in driver for CACHE_LEVEL=BLOCKLET

    Things handled as part of this PR:
    1. Modified code to use min/max for executor pruning in the Blocklet dataMap when the filter column's min/max is not cached in the driver. When the columns to be cached in the driver are specified and CACHE_LEVEL = BLOCKLET, executor min/max pruning was not happening, which can increase query time.
    2. Removed unwanted addition of schemaEvolutionEntry to the schema on ALTER TABLE SET and UNSET table properties.

     - [ ] Any interfaces changed? No
     - [ ] Any backward compatibility impacted? No
     - [ ] Document update required? No
     - [ ] Testing done? Yes
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata query_slow_executor_pruning

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2540.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2540

----

commit 6f55b5fafe8214e939f763f750382bbf0bfdcb42
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-07-23T06:21:23Z

    Modified code to use min/max in executor pruning for Blocklet data map when filter column min/max is not cached in driver

    Removed unwanted addition of schemaEvolutionEntry to schema on Alter SET and UNSET table properties

----

---
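To make the pruning behaviour described above concrete, the following is a minimal, self-contained Java sketch of the two-phase idea, not CarbonData's actual classes: BlockletMeta, driverPrune, executorPrune and cachedColumns are all hypothetical names. The point it illustrates is that the driver can only prune blocklets on columns whose min/max it has cached, so when the filter column is not cached the blocklet must be kept and the executor re-applies min/max pruning using the full metadata.

// A hedged sketch of two-phase (driver + executor) min/max pruning.
// All names here are hypothetical stand-ins, not CarbonData APIs.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MinMaxPruningSketch {

  /** Per-blocklet metadata: min/max for every column, keyed by column name. */
  static class BlockletMeta {
    final String id;
    final Map<String, long[]> minMax; // value[0] = min, value[1] = max
    BlockletMeta(String id, Map<String, long[]> minMax) {
      this.id = id;
      this.minMax = minMax;
    }
  }

  /**
   * Driver-side pruning: only columns present in the driver cache can be used.
   * If the filter column is not cached, the blocklet is kept and the decision
   * is deferred to the executor.
   */
  static List<BlockletMeta> driverPrune(List<BlockletMeta> blocklets,
      Set<String> cachedColumns, String filterColumn, long filterValue) {
    List<BlockletMeta> survivors = new ArrayList<>();
    for (BlockletMeta b : blocklets) {
      if (!cachedColumns.contains(filterColumn)) {
        survivors.add(b); // cannot decide here; executor will re-check
      } else if (contains(b.minMax.get(filterColumn), filterValue)) {
        survivors.add(b);
      }
    }
    return survivors;
  }

  /**
   * Executor-side pruning: min/max for the filter column is available locally
   * (in CarbonData it would come from the file footer), so pruning always applies.
   */
  static List<BlockletMeta> executorPrune(List<BlockletMeta> blocklets,
      String filterColumn, long filterValue) {
    List<BlockletMeta> survivors = new ArrayList<>();
    for (BlockletMeta b : blocklets) {
      if (contains(b.minMax.get(filterColumn), filterValue)) {
        survivors.add(b);
      }
    }
    return survivors;
  }

  private static boolean contains(long[] minMax, long value) {
    return minMax != null && value >= minMax[0] && value <= minMax[1];
  }

  public static void main(String[] args) {
    List<BlockletMeta> blocklets = List.of(
        new BlockletMeta("blocklet-0", Map.of("c1", new long[]{1, 10})),
        new BlockletMeta("blocklet-1", Map.of("c1", new long[]{50, 90})));
    // Filter column c1 is NOT in the driver cache, so the driver keeps both
    // blocklets; the executor then prunes blocklet-1 using its min/max.
    List<BlockletMeta> afterDriver = driverPrune(blocklets, Set.of("c2"), "c1", 5);
    List<BlockletMeta> afterExecutor = executorPrune(afterDriver, "c1", 5);
    System.out.println("after driver: " + afterDriver.size()
        + ", after executor: " + afterExecutor.size()); // prints 2, then 1
  }
}

Without the executor-side step, the query would scan blocklet-1 even though its range [50, 90] cannot contain the filter value 5, which is the slowdown this PR addresses.

---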
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6162/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7405/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6169/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5971/

---
Github user ravipesala commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2540#discussion_r204630970

    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java ---
    @@ -221,7 +223,30 @@ public void setNumberOfPages(int numberOfPages) {
           output.writeInt(measureChunksLength.get(i));
         }
         writeChunkInfoForOlderVersions(output);
    +    serializeMinMaxValues(output);
    +  }
    +  /**
    +   * serialize min max values
    +   *
    +   * @param output
    +   * @throws IOException
    +   */
    +  private void serializeMinMaxValues(DataOutput output) throws IOException {
    --- End diff --

    I don't think it is required to serialize the min/max from the driver. If the columns are not cached, then read the footer on the executor side.

---
Github user manishgupta88 commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2540#discussion_r204682071

    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java ---
    @@ -221,7 +223,30 @@ public void setNumberOfPages(int numberOfPages) {
           output.writeInt(measureChunksLength.get(i));
         }
         writeChunkInfoForOlderVersions(output);
    +    serializeMinMaxValues(output);
    +  }
    +  /**
    +   * serialize min max values
    +   *
    +   * @param output
    +   * @throws IOException
    +   */
    +  private void serializeMinMaxValues(DataOutput output) throws IOException {
    --- End diff --

    Ok. I will remove the serialization of min/max and read the footer using the useMinMaxForPruning flag.

---
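The approach agreed in this review thread is to not serialize driver-side min/max at all and instead let the executor read the blocklet min/max from the file footer when a flag indicates the driver could not prune on the filter column. The following is a hedged sketch of that idea in plain Java; FooterReader, canSkipBlocklet and the exact semantics of useMinMaxForPruning are assumptions for illustration, not the actual BlockletInfo/DataMap code.

// Hypothetical sketch: executor-side footer read gated by useMinMaxForPruning.
import java.io.IOException;
import java.util.Map;

public class ExecutorFooterPruneSketch {

  /** Stand-in for reading per-column blocklet min/max out of the file footer. */
  interface FooterReader {
    Map<String, long[]> readMinMax(String blockletId) throws IOException;
  }

  /**
   * Returns true if the blocklet can be skipped. When useMinMaxForPruning is
   * false, the driver already had the column cached and pruned on it, so the
   * executor skips the (relatively expensive) footer read.
   */
  static boolean canSkipBlocklet(String blockletId, boolean useMinMaxForPruning,
      String filterColumn, long filterValue, FooterReader footer) throws IOException {
    if (!useMinMaxForPruning) {
      return false; // nothing more to check on the executor side
    }
    long[] minMax = footer.readMinMax(blockletId).get(filterColumn);
    return minMax != null && (filterValue < minMax[0] || filterValue > minMax[1]);
  }

  public static void main(String[] args) throws IOException {
    FooterReader fakeFooter = id -> Map.of("c1", new long[]{100, 200});
    System.out.println(canSkipBlocklet("blocklet-0", true, "c1", 5, fakeFooter));  // true
    System.out.println(canSkipBlocklet("blocklet-0", false, "c1", 5, fakeFooter)); // false
  }
}

Compared with serializing min/max from the driver, this keeps the serialized task payload small and only pays the footer-read cost for blocklets whose filter column was not cached in the driver.

---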
Github user manishgupta88 commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    @ravipesala ... handled review comments. Please review and merge.

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7448/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6203/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    retest this please

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7458/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6213/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5983/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6222/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    LGTM

---