ajantha-bhat opened a new pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913

### Why is this PR needed?
a) For a table with 200K segments in the cloud, a Presto partition query was taking more than 5 hours because it read all segment files for partition pruning. Now it takes less than a minute!

### What changes were proposed in this PR?
a) HiveTableHandle already has the partition specs matching the filters (it has queried the metastore for all partitions and pruned them). So, create the partitionSpec based on that. This is handled for both prestodb and prestosql.
b) #3885 broke prestodb compilation; only prestosql is compiled.
c) #3887 also didn't handle prestodb.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No [Need to add spark support and create better UT for presto, TODO]. Verified manually.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email]
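The idea in the description (build partition specs from partitions the metastore has already pruned, instead of reading every segment file) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: `PartitionSpecSketch`, `Spec`, and `fromPrunedPartitions` are hypothetical names, partition names are assumed to follow the Hive-style `key=value/key=value` form, and each partition is assumed to live at its default location under the table path.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not CarbonData's actual PartitionSpec or Presto's API.
final class PartitionSpecSketch {

  // Simplified stand-in for a partition spec: the "key=value" entries plus a location.
  static final class Spec {
    final List<String> partitions;
    final String location;

    Spec(List<String> partitions, String location) {
      this.partitions = Collections.unmodifiableList(new ArrayList<>(partitions));
      this.location = location;
    }
  }

  private PartitionSpecSketch() { }

  // Builds one spec per already-pruned partition name, e.g. "country=US/year=2020".
  // Assumption: the caller obtained these names from the metastore after filter pruning,
  // so no segment file needs to be opened here.
  static List<Spec> fromPrunedPartitions(String tablePath, List<String> partitionNames) {
    List<Spec> specs = new ArrayList<>();
    for (String name : partitionNames) {
      List<String> keyValues = new ArrayList<>();
      for (String part : name.split("/")) {
        if (!part.isEmpty()) {
          keyValues.add(part);
        }
      }
      // Assumption: default layout, partition directory directly under the table path.
      specs.add(new Spec(keyValues, tablePath + "/" + name));
    }
    return specs;
  }
}
```

In this sketch the cost scales with the number of matching partitions rather than with the number of segments, which is the effect the description attributes to the change.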
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-687331617 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3979/
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-687335279 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2239/
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688316413 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3984/
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688323439 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2244/
ajantha-bhat commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688403371 retest this please
ajantha-bhat commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688403784 PR is ready. Please review @QiangCai , @kunal642
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688455154 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3990/
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688455954 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2251/
marchpure commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688802921 retest this please
Indhumathi27 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r484845416

##########
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##########
@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
       // file metastore case tablePath can be null, so get from location
       location = table.getStorage().getLocation();
     }
+    List<PartitionSpec> filteredPartitions = new ArrayList<>();

Review comment:
Can you add a testcase with partition filter?
ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r484845973

##########
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##########
@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
       // file metastore case tablePath can be null, so get from location
       location = table.getStorage().getLocation();
     }
+    List<PartitionSpec> filteredPartitions = new ArrayList<>();

Review comment:
Please read the description; I have mentioned why a UT cannot be added now.
ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r484845973

##########
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##########
@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
       // file metastore case tablePath can be null, so get from location
       location = table.getStorage().getLocation();
     }
+    List<PartitionSpec> filteredPartitions = new ArrayList<>();

Review comment:
Please read the description; I have already mentioned why a UT cannot be added now.
Indhumathi27 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r484849645

##########
File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/impl/CarbonTableReader.java
##########
@@ -245,16 +242,14 @@ private CarbonTableCacheModel getValidCacheBySchemaTableName(SchemaTableName sch
    *
    * @param tableCacheModel cached table
    * @param filters carbonData filters
-   * @param constraints presto filters
+   * @param filteredPartitions matched partitionSpec for the filter
    * @param config hadoop conf
    * @return list of multiblock split
    * @throws IOException
    */
-  public List<CarbonLocalMultiBlockSplit> getInputSplits(
-      CarbonTableCacheModel tableCacheModel,
-      Expression filters,
-      TupleDomain<HiveColumnHandle> constraints,
-      Configuration config) throws IOException {
+  public List<CarbonLocalMultiBlockSplit> getInputSplits(CarbonTableCacheModel tableCacheModel,
+      Expression filters, List<PartitionSpec> filteredPartitions, Configuration config)

Review comment:
Can you revert to the old style?
Indhumathi27 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r484851607

##########
File path: integration/presto/src/test/prestodb/org/apache/carbondata/presto/server/PrestoTestUtil.scala
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.presto.server
+
+import com.facebook.presto.jdbc.PrestoArray
+

Review comment:
Remove extra lines
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688858320 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4003/
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688859734 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2263/
marchpure commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-689003409 I just tested with this PR. Querying a non-partition table returns an EMPTY RESULT. Querying a partition table works well.
marchpure removed a comment on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-689003409 I just tested with this PR. Querying a non-partition table returns an EMPTY RESULT. Querying a partition table works well.
kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485626615

##########
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataModule.java
##########
@@ -21,6 +21,8 @@
 import static java.util.Objects.requireNonNull;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.CarbonProperties;

Review comment:
Why is this change required?