GitHub user manishgupta88 opened a pull request:
    https://github.com/apache/carbondata/pull/2540

    [WIP] Handled executor min/max pruning when filter column is not cached in driver for CACHE_LEVEL=BLOCKLET

    Things handled as part of this PR:
    1. Modified code to use min/max for executor pruning in the Blocklet dataMap when the filter column's min/max is not cached in the driver. When the columns to be cached in the driver are specified and CACHE_LEVEL = BLOCKLET, executor min/max pruning was not happening, which can increase query time.
    2. Removed unwanted addition of schemaEvolutionEntry to the schema on ALTER TABLE SET and UNSET table properties.

     - [ ] Any interfaces changed? No
     - [ ] Any backward compatibility impacted? No
     - [ ] Document update required? No
     - [ ] Testing done? Yes
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata query_slow_executor_pruning

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2540.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2540

----

commit 6f55b5fafe8214e939f763f750382bbf0bfdcb42
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-07-23T06:21:23Z

    Modified code to use min/max in executor pruning for Blocklet data map when filter column min/max is not cached in driver

    Removed unwanted addition of schemaEvolutionEntry to schema on Alter SET and UNSET table properties

----

---
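To make the pruning behaviour described above concrete, the following is a minimal, self-contained Java sketch of the two-phase idea, not CarbonData's actual classes: BlockletMeta, driverPrune, executorPrune and cachedColumns are all hypothetical names. The point it illustrates is that the driver can only prune blocklets on columns whose min/max it has cached, so when the filter column is not cached the blocklet must be kept and the executor re-applies min/max pruning using the full metadata.

// A hedged sketch of two-phase (driver + executor) min/max pruning.
// All names here are hypothetical stand-ins, not CarbonData APIs.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MinMaxPruningSketch {

  /** Per-blocklet metadata: min/max for every column, keyed by column name. */
  static class BlockletMeta {
    final String id;
    final Map<String, long[]> minMax; // value[0] = min, value[1] = max
    BlockletMeta(String id, Map<String, long[]> minMax) {
      this.id = id;
      this.minMax = minMax;
    }
  }

  /**
   * Driver-side pruning: only columns present in the driver cache can be used.
   * If the filter column is not cached, the blocklet is kept and the decision
   * is deferred to the executor.
   */
  static List<BlockletMeta> driverPrune(List<BlockletMeta> blocklets,
      Set<String> cachedColumns, String filterColumn, long filterValue) {
    List<BlockletMeta> survivors = new ArrayList<>();
    for (BlockletMeta b : blocklets) {
      if (!cachedColumns.contains(filterColumn)) {
        survivors.add(b); // cannot decide here; executor will re-check
      } else if (contains(b.minMax.get(filterColumn), filterValue)) {
        survivors.add(b);
      }
    }
    return survivors;
  }

  /**
   * Executor-side pruning: min/max for the filter column is available locally
   * (in CarbonData it would come from the file footer), so pruning always applies.
   */
  static List<BlockletMeta> executorPrune(List<BlockletMeta> blocklets,
      String filterColumn, long filterValue) {
    List<BlockletMeta> survivors = new ArrayList<>();
    for (BlockletMeta b : blocklets) {
      if (contains(b.minMax.get(filterColumn), filterValue)) {
        survivors.add(b);
      }
    }
    return survivors;
  }

  private static boolean contains(long[] minMax, long value) {
    return minMax != null && value >= minMax[0] && value <= minMax[1];
  }

  public static void main(String[] args) {
    List<BlockletMeta> blocklets = List.of(
        new BlockletMeta("blocklet-0", Map.of("c1", new long[]{1, 10})),
        new BlockletMeta("blocklet-1", Map.of("c1", new long[]{50, 90})));
    // Filter column c1 is NOT in the driver cache, so the driver keeps both
    // blocklets; the executor then prunes blocklet-1 using its min/max.
    List<BlockletMeta> afterDriver = driverPrune(blocklets, Set.of("c2"), "c1", 5);
    List<BlockletMeta> afterExecutor = executorPrune(afterDriver, "c1", 5);
    System.out.println("after driver: " + afterDriver.size()
        + ", after executor: " + afterExecutor.size()); // prints 2, then 1
  }
}

Without the executor-side step, the query would scan blocklet-1 even though its range [50, 90] cannot contain the filter value 5, which is the slowdown this PR addresses.

---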
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6162/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7405/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6169/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5971/

---
Github user ravipesala commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2540#discussion_r204630970

    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java ---
    @@ -221,7 +223,30 @@ public void setNumberOfPages(int numberOfPages) {
           output.writeInt(measureChunksLength.get(i));
         }
         writeChunkInfoForOlderVersions(output);
    +    serializeMinMaxValues(output);
    +  }
    +  /**
    +   * serialize min max values
    +   *
    +   * @param output
    +   * @throws IOException
    +   */
    +  private void serializeMinMaxValues(DataOutput output) throws IOException {
    --- End diff --

    I don't think it is required to serialize the min/max from the driver. If the columns are not cached, then read the footer on the executor side.

---
Github user manishgupta88 commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2540#discussion_r204682071

    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java ---
    @@ -221,7 +223,30 @@ public void setNumberOfPages(int numberOfPages) {
           output.writeInt(measureChunksLength.get(i));
         }
         writeChunkInfoForOlderVersions(output);
    +    serializeMinMaxValues(output);
    +  }
    +  /**
    +   * serialize min max values
    +   *
    +   * @param output
    +   * @throws IOException
    +   */
    +  private void serializeMinMaxValues(DataOutput output) throws IOException {
    --- End diff --

    Ok. I will remove the serialization of min/max and read the footer using the useMinMaxForPruning flag.

---
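The approach agreed in this review thread is to not serialize driver-side min/max at all and instead let the executor read the blocklet min/max from the file footer when a flag indicates the driver could not prune on the filter column. The following is a hedged sketch of that idea in plain Java; FooterReader, canSkipBlocklet and the exact semantics of useMinMaxForPruning are assumptions for illustration, not the actual BlockletInfo/DataMap code.

// Hypothetical sketch: executor-side footer read gated by useMinMaxForPruning.
import java.io.IOException;
import java.util.Map;

public class ExecutorFooterPruneSketch {

  /** Stand-in for reading per-column blocklet min/max out of the file footer. */
  interface FooterReader {
    Map<String, long[]> readMinMax(String blockletId) throws IOException;
  }

  /**
   * Returns true if the blocklet can be skipped. When useMinMaxForPruning is
   * false, the driver already had the column cached and pruned on it, so the
   * executor skips the (relatively expensive) footer read.
   */
  static boolean canSkipBlocklet(String blockletId, boolean useMinMaxForPruning,
      String filterColumn, long filterValue, FooterReader footer) throws IOException {
    if (!useMinMaxForPruning) {
      return false; // nothing more to check on the executor side
    }
    long[] minMax = footer.readMinMax(blockletId).get(filterColumn);
    return minMax != null && (filterValue < minMax[0] || filterValue > minMax[1]);
  }

  public static void main(String[] args) throws IOException {
    FooterReader fakeFooter = id -> Map.of("c1", new long[]{100, 200});
    System.out.println(canSkipBlocklet("blocklet-0", true, "c1", 5, fakeFooter));  // true
    System.out.println(canSkipBlocklet("blocklet-0", false, "c1", 5, fakeFooter)); // false
  }
}

Compared with serializing min/max from the driver, this keeps the serialized task payload small and only pays the footer-read cost for blocklets whose filter column was not cached in the driver.

---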
Github user manishgupta88 commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    @ravipesala ... handled review comments. Please review and merge.

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7448/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6203/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    retest this please

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7458/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6213/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5983/

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6222/

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2540

    LGTM

---