GitHub user jackylk opened a pull request:
https://github.com/apache/carbondata/pull/2197

[WIP] Add Profiler output in EXPLAIN command

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed?
 - [ ] Any backward compatibility impacted?
 - [ ] Document update required?
 - [ ] Testing done
       Please provide details on
       - Whether new unit test cases have been added or why no new tests are required?
       - How it is tested? Please attach test report.
       - Is it a performance related change? Please attach the performance test report.
       - Any additional information to help reviewers in testing this change.
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jackylk/incubator-carbondata profiler

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2197.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2197

----
commit 028106ffed51bc0b8fa3239dd1505df685d1e60c
Author: Jacky Li <jacky.likun@...>
Date:   2018-04-20T08:02:32Z

    support profiler in EXPLAIN

commit 8ce5194b594a1be60b1a22116777df21bc47d477
Author: Jacky Li <jacky.likun@...>
Date:   2018-04-20T10:06:17Z

    add test

----
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4053/
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5240/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4066/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5251/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4072/
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183196989

--- Diff: dev/findbugs-exclude.xml ---
@@ -40,7 +40,7 @@
     <Bug pattern="MS_MUTABLE_ARRAY"/>
   </Match>
   <Match>
-    <Class name="org.apache.carbondata.core.scan.expression.ExpressionResult"/>
+    <Class name="org.apache.carbondata.core.scan.filterExpression.ExpressionResult"/>
--- End diff --

not required match
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183199499

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateTableSelection.scala ---
@@ -379,6 +382,68 @@ class TestPreAggregateTableSelection extends SparkQueryTest with BeforeAndAfterA
     checkAnswer(df, Seq(Row(10,10.0)))
   }
 
+  test("explain query") {
+    sql("explain select name,sum(age) from mainTable where name = 'a' group by name").show(false)
+    var rows = sql("explain select name,sum(age) from mainTable where name = 'a' group by name").collect()
+    assertResult(
+      """== CarbonData Profiler ==
+        |Query rewrite based on DataMap:
+        | - agg1 (preaggregate)
+        |Table Scan on maintable_agg1
+        | - filter: (maintable_name <> null and maintable_name = a)
+        | - pruned by main index
+        | - all blocklets: 1
+        | - skipped blocklets: 1
+        |""".stripMargin)(rows(0).getString(0))
+
+    rows = sql("explain select name,sum(age) from mainTable group by name").collect()
+    assertResult(
+      """== CarbonData Profiler ==
+        |Query rewrite based on DataMap:
+        | - agg1 (preaggregate)
+        |Table Scan on maintable_agg1
+        | - filter: None
+        | - all blocklets: 1
+        | - skipped blocklets: 0
+        |""".stripMargin)(rows(0).getString(0))
+  }
+
+  test("explain query with lucene datamap") {
--- End diff --

better to move this testcase to lucene module or add a separate testsuite for profile
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183200529

--- Diff: dev/findbugs-exclude.xml ---
@@ -40,7 +40,7 @@
     <Bug pattern="MS_MUTABLE_ARRAY"/>
   </Match>
   <Match>
-    <Class name="org.apache.carbondata.core.scan.expression.ExpressionResult"/>
+    <Class name="org.apache.carbondata.core.scan.filterExpression.ExpressionResult"/>
--- End diff --

fixed
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183200584

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateTableSelection.scala ---
@@ -379,6 +382,68 @@ class TestPreAggregateTableSelection extends SparkQueryTest with BeforeAndAfterA
     checkAnswer(df, Seq(Row(10,10.0)))
   }
 
+  test("explain query") {
+    sql("explain select name,sum(age) from mainTable where name = 'a' group by name").show(false)
+    var rows = sql("explain select name,sum(age) from mainTable where name = 'a' group by name").collect()
+    assertResult(
+      """== CarbonData Profiler ==
+        |Query rewrite based on DataMap:
+        | - agg1 (preaggregate)
+        |Table Scan on maintable_agg1
+        | - filter: (maintable_name <> null and maintable_name = a)
+        | - pruned by main index
+        | - all blocklets: 1
+        | - skipped blocklets: 1
+        |""".stripMargin)(rows(0).getString(0))
+
+    rows = sql("explain select name,sum(age) from mainTable group by name").collect()
+    assertResult(
+      """== CarbonData Profiler ==
+        |Query rewrite based on DataMap:
+        | - agg1 (preaggregate)
+        |Table Scan on maintable_agg1
+        | - filter: None
+        | - all blocklets: 1
+        | - skipped blocklets: 0
+        |""".stripMargin)(rows(0).getString(0))
+  }
+
+  test("explain query with lucene datamap") {
--- End diff --

I have moved it to LuceneFineGrainDataMapSuite
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4086/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5266/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5286/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4106/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5309/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4129/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4136/
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183283756

--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.profiler;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+
+/**
+ * An information collector used for EXPLAIN command, to print out
+ * SQL rewrite and pruning information
+ */
+@InterfaceAudience.Internal
+public class ExplainCollector {
+
+  private static final ThreadLocal<ExplainCollector> explainProfiler = new ThreadLocal<>();
+
+  private List<String> olapDataMapProviders = new ArrayList<>();
+  private List<String> olapDataMapNames = new ArrayList<>();
+
+  // mapping of table name to pruning info
+  private Map<String, TablePruningInfo> scans = new ConcurrentHashMap<>();
+
+  public void recordMatchedOlapDataMap(String dataMapProvider, String dataMapName) {
+    Objects.requireNonNull(dataMapProvider);
+    Objects.requireNonNull(dataMapName);
+    olapDataMapProviders.add(dataMapProvider);
+    olapDataMapNames.add(dataMapName);
+  }
+
+  public static boolean enabled() {
+    return explainProfiler.get() != null;
+  }
+
+  public static void setup() {
+    explainProfiler.set(new ExplainCollector());
+  }
+
+  public static ExplainCollector get() {
+    return explainProfiler.get();
+  }
+
+  public static void addPruningInfo(String tableName) {
+    if (enabled()) {
+      ExplainCollector profiler = get();
+      if (!profiler.scans.containsKey(tableName)) {
+        profiler.scans.put(tableName, new TablePruningInfo());
+      }
+    }
+  }
+
+  public static void setFilterStatement(String filterStatement) {
+    if (enabled()) {
+      TablePruningInfo scan = getCurrentTablePruningInfo();
+      scan.setFilterStatement(filterStatement);
+    }
+  }
+
+  public static void recordDefaultDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+    if (enabled()) {
+      TablePruningInfo scan = getCurrentTablePruningInfo();
+      scan.setNumBlockletsAfterDefaultPruning(dataMapSchema, numBlocklets);
+    }
+  }
+
+  public static void recordCGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+    if (enabled()) {
+      TablePruningInfo scan = getCurrentTablePruningInfo();
+      scan.setNumBlockletsAfterCGPruning(dataMapSchema, numBlocklets);
+    }
+  }
+
+  public static void recordFGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+    if (enabled()) {
+      TablePruningInfo scan = getCurrentTablePruningInfo();
+      scan.setNumBlockletsAfterFGPruning(dataMapSchema, numBlocklets);
+    }
+  }
+
+  public static void setTotalBlocklets(int totalBlocklets) {
+    if (enabled()) {
+      TablePruningInfo scan = getCurrentTablePruningInfo();
+      scan.setTotalBlocklets(totalBlocklets);
+    }
+  }
+
+  /**
+   * Return the current TablePruningInfo (It is the last one in the map, since it is in
+   * single thread)
+   */
+  private static TablePruningInfo getCurrentTablePruningInfo() {
+    Iterator<TablePruningInfo> iterator = explainProfiler.get().scans.values().iterator();
+    TablePruningInfo output = null;
+    while (iterator.hasNext()) {
+      output = iterator.next();
+    }
+    return output;
+  }
+
+  public static void remove() {
+    explainProfiler.remove();
--- End diff --

Better check enabled() here as well
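
The suggestion above can be sketched roughly as follows. This is a minimal illustration only, not the code from the PR: the class name GuardedRemoveSketch and the StringBuilder payload are hypothetical stand-ins for ExplainCollector and its per-thread state. Note that ThreadLocal.remove() is already a no-op when nothing has been set for the current thread, so the guard mainly keeps remove() consistent with the other static methods.

```java
// Hypothetical stand-in for ExplainCollector; names are illustrative only.
class GuardedRemoveSketch {

  // Per-thread collector state; null means profiling is off for this thread.
  private static final ThreadLocal<StringBuilder> COLLECTOR = new ThreadLocal<>();

  static boolean enabled() {
    return COLLECTOR.get() != null;
  }

  static void setup() {
    COLLECTOR.set(new StringBuilder());
  }

  // Same guard as the other static methods: only act when a collector exists.
  static void remove() {
    if (enabled()) {
      COLLECTOR.remove();
    }
  }

  public static void main(String[] args) {
    remove();                       // safe: nothing was set up yet
    setup();
    System.out.println(enabled());  // collector is present after setup()
    remove();
    System.out.println(enabled());  // collector is gone after remove()
  }
}
```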
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183284986

--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
[...]
+  public static void remove() {
+    explainProfiler.remove();
--- End diff --

fixed
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183285491

--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
[...]
+  /**
+   * Return the current TablePruningInfo (It is the last one in the map, since it is in
+   * single thread)
+   */
+  private static TablePruningInfo getCurrentTablePruningInfo() {
--- End diff --

How can you make sure that you are adding information to the right table? I think it is better to pass tableName and get the `TablePruningInfo` as per that.
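
One way to read this suggestion is sketched below, assuming the recording methods take the table name and look the entry up directly. Everything here is hypothetical: PruningInfoByTableSketch and its nested PruningInfo are simplified stand-ins for ExplainCollector and TablePruningInfo, and the method signatures are not taken from the PR. The lookup also avoids relying on iteration order, which ConcurrentHashMap does not define, so "the last one in the map" is not guaranteed to be the most recently added table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; class, field, and method names are illustrative only.
class PruningInfoByTableSketch {

  // Simplified stand-in for TablePruningInfo.
  static final class PruningInfo {
    private int totalBlocklets;

    void setTotalBlocklets(int totalBlocklets) {
      this.totalBlocklets = totalBlocklets;
    }

    int getTotalBlocklets() {
      return totalBlocklets;
    }
  }

  // Per-thread mapping of table name to pruning info; null means disabled.
  private static final ThreadLocal<Map<String, PruningInfo>> SCANS = new ThreadLocal<>();

  static boolean enabled() {
    return SCANS.get() != null;
  }

  static void setup() {
    SCANS.set(new ConcurrentHashMap<>());
  }

  static void remove() {
    SCANS.remove();
  }

  static void addPruningInfo(String tableName) {
    if (enabled()) {
      SCANS.get().computeIfAbsent(tableName, name -> new PruningInfo());
    }
  }

  // The caller names the table, so the value cannot land on another table's entry.
  static void setTotalBlocklets(String tableName, int totalBlocklets) {
    if (enabled()) {
      PruningInfo info = SCANS.get().get(tableName);
      if (info != null) {
        info.setTotalBlocklets(totalBlocklets);
      }
    }
  }

  // Returns -1 when profiling is disabled or the table was never registered.
  static int getTotalBlocklets(String tableName) {
    if (!enabled()) {
      return -1;
    }
    PruningInfo info = SCANS.get().get(tableName);
    return info == null ? -1 : info.getTotalBlocklets();
  }
}
```

With this shape, a query that scans two tables can record values for each of them in any order, at the cost of threading the table name through every call site.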