[GitHub] [carbondata] QiangCai opened a new pull request #4100: [CARBONDATA-4138] reorder Carbon Expression instead of Spark Filter


[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

CarbonDataQA2 commented on pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#issuecomment-803624034


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5061/
   


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

CarbonDataQA2 commented on pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#issuecomment-803640665


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3309/
   



[GitHub] [carbondata] QiangCai commented on pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

QiangCai commented on pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#issuecomment-806410550


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

CarbonDataQA2 commented on pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#issuecomment-806463869


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3342/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

CarbonDataQA2 commented on pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#issuecomment-806464050


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5094/
   



[GitHub] [carbondata] QiangCai commented on pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

QiangCai commented on pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#issuecomment-819152930


   @kunal642 please review and merge



[GitHub] [carbondata] kunal642 commented on a change in pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

kunal642 commented on a change in pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#discussion_r612943739



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/expression/optimize/ExpressionOptimizer.java
##########
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.scan.expression.optimize;
+
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.optimize.reorder.ExpressionReorder;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+/**
+ * optimize Carbon Expression
+ */
+public class ExpressionOptimizer {
+
+  private final OptimizeRule[] rules = { new ExpressionReorder() };

Review comment:
       No need for a private static class; make this final and use it directly.
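
One way to read this suggestion, as a minimal sketch only: keep the rule list in a `static final` field and iterate over it directly. The `RuleChain` class and its nested `Rule` interface are hypothetical stand-ins, not CarbonData's `ExpressionOptimizer` or `OptimizeRule`, and plain strings stand in for expression trees.

```java
// Hypothetical stand-in for an optimizer that owns a fixed list of rules.
final class RuleChain {

  // Stand-in for OptimizeRule; a String plays the role of an expression.
  interface Rule {
    String apply(String expression);
  }

  // Created once for the class and never reassigned.
  private static final Rule[] RULES = {
      String::trim,
      e -> e.replaceAll("\\s+", " ")
  };

  private RuleChain() { }

  // Applies every rule in declaration order and returns the final result.
  static String optimize(String expression) {
    String current = expression;
    for (Rule rule : RULES) {
      current = rule.apply(current);
    }
    return current;
  }

  public static void main(String[] args) {
    System.out.println(optimize("  a = 1   and b = 2 "));
  }
}
```

Because the array is shared by all callers, this only works if each rule is stateless; that is an assumption of the sketch, not something the diff confirms.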

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/expression/optimize/reorder/StorageOrdinal.java
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.scan.expression.optimize.reorder;
+
+import java.util.Map;
+
+import org.apache.carbondata.core.scan.expression.Expression;
+
+/**
+ *

Review comment:
       Remove the empty description.

##########
File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##########
@@ -2602,6 +2602,11 @@ private CarbonCommonConstants() {
 
   public static final String FILE_HEADER = "fileHeader";
 
+  @CarbonProperty(dynamicConfigurable = true)
+  public static final String CARBON_OPTIMIZE_FILTER = "carbon.optimize.filter";

Review comment:
       "carbon.reorder.filter" and "carbon.optimize.filter" seem to be doing the same thing. Please remove one of them.

##########
File path: integration/spark/src/test/scala/org/apache/spark/carbondata/query/TestFilterReordering.scala
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.carbondata.query
+
+import java.util
+
+import org.apache.spark.sql.{CarbonEnv, CarbonThreadUtil}
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.scan.expression.{ColumnExpression, Expression, LiteralExpression}
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression
+import org.apache.carbondata.core.scan.expression.logical.{AndExpression, OrExpression}
+import org.apache.carbondata.core.scan.expression.optimize.ExpressionOptimizer
+
+class TestFilterReordering extends QueryTest with BeforeAndAfterAll {
+
+  override protected def beforeAll(): Unit = {
+    sql("drop table if exists filter_reorder")
+    sql("create table filter_reorder(one string, two string, three string, four int, " +
+      "five int) stored as carbondata")
+  }
+
+  test("Test filter reorder with various conditions") {
+    checkOptimizer("(four = 11 and two = 11) or (one = 11)",

Review comment:
       Why pass the filter expression as a string instead of an Expression object? There is no need for translation.
   
   Refer: https://github.com/apache/carbondata/pull/3902/files#diff-ddfc2bc65b7f1055d4ae72fdfcdad5418e44373a6391fc348568b9a8bee506f6
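
As a rough illustration of the suggestion, the filter `(four = 11 and two = 11) or (one = 11)` can be assembled as an object tree rather than parsed from a string. The test already imports CarbonData's `ColumnExpression`, `LiteralExpression`, `EqualToExpression`, `AndExpression` and `OrExpression`; the sketch below does not use them. `ExprSketch`, `Expr`, `eq`, `and` and `or` are hypothetical stand-ins so that it runs without CarbonData on the classpath.

```java
// Stand-in expression nodes; not the CarbonData Expression classes.
final class ExprSketch {

  // Minimal node type: every node can render itself as text.
  interface Expr {
    String render();
  }

  private ExprSketch() { }

  // Leaf node: column = literal.
  static Expr eq(String column, int value) {
    return () -> column + " = " + value;
  }

  // Inner node: left and right.
  static Expr and(Expr left, Expr right) {
    return () -> "(" + left.render() + " and " + right.render() + ")";
  }

  // Inner node: left or right.
  static Expr or(Expr left, Expr right) {
    return () -> "(" + left.render() + " or " + right.render() + ")";
  }

  public static void main(String[] args) {
    // Built directly as objects, so no string-to-expression translation is involved.
    Expr filter = or(and(eq("four", 11), eq("two", 11)), eq("one", 11));
    System.out.println(filter.render());
  }
}
```

With real expression objects, a test could compare the optimizer's output tree against an expected tree instead of going through SQL text.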

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/expression/optimize/reorder/ExpressionReorder.java
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.scan.expression.optimize.reorder;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.optimize.OptimizeRule;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+/**
+ * reorder Expression by storage order
+ */
+public class ExpressionReorder extends OptimizeRule {
+
+  @Override
+  public Expression optimize(CarbonTable table, Expression expression) {
+    if (!CarbonProperties.isFilterReorderingEnabled()) {
+      return expression;
+    }
+    MultiExpression multiExpression = MultiExpression.build(expression);
+    // unsupported expression
+    if (multiExpression == null) {

Review comment:
       Move this check above MultiExpression.build so that we don't enter the reorder code if it is null.

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/expression/optimize/reorder/ExpressionWithOrdinal.java
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.scan.expression.optimize.reorder;
+
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.UnknownExpression;
+import org.apache.carbondata.core.scan.expression.conditional.ConditionalExpression;
+
+/**
+ * a wrapper class of Expression with storage ordinal
+ */
+public class ExpressionWithOrdinal extends StorageOrdinal {

Review comment:
       All classes inside the optimize package except ExpressionReorder should have the default access modifier, so that they are not used for creating expressions by mistake.





[GitHub] [carbondata] QiangCai commented on a change in pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

QiangCai commented on a change in pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#discussion_r615580115



##########
File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##########
@@ -2602,6 +2602,11 @@ private CarbonCommonConstants() {
 
   public static final String FILE_HEADER = "fileHeader";
 
+  @CarbonProperty(dynamicConfigurable = true)
+  public static final String CARBON_OPTIMIZE_FILTER = "carbon.optimize.filter";

Review comment:
       Reorder is one part of optimization.
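
One possible reading of this reply, sketched under assumptions: `carbon.optimize.filter` gates the optimizer as a whole, while `carbon.reorder.filter` gates the reorder rule inside it. The `FilterFlags` class, the use of `java.util.Properties`, and the `true` defaults are all hypothetical; the diff shown here does not confirm how `CarbonProperties` resolves either key.

```java
import java.util.Properties;

// Hypothetical helper showing a two-level switch: optimizer first, then one rule.
final class FilterFlags {

  static final String OPTIMIZE = "carbon.optimize.filter";
  static final String REORDER = "carbon.reorder.filter";

  private FilterFlags() { }

  // Reordering runs only if both the optimizer and the reorder rule are enabled.
  // The "true" defaults are an assumption made for this sketch.
  static boolean reorderEnabled(Properties props) {
    boolean optimize = Boolean.parseBoolean(props.getProperty(OPTIMIZE, "true"));
    boolean reorder = Boolean.parseBoolean(props.getProperty(REORDER, "true"));
    return optimize && reorder;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(OPTIMIZE, "false");
    System.out.println(reorderEnabled(props));
  }
}
```

Under this reading the two keys are not redundant: turning off the outer key disables every rule, while the inner key disables reordering alone.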





[GitHub] [carbondata] QiangCai commented on a change in pull request #4100: [CARBONDATA-4138] Reordering Carbon Expression instead of Spark Filter

GitBox

QiangCai commented on a change in pull request #4100:
URL: https://github.com/apache/carbondata/pull/4100#discussion_r615581462



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/expression/optimize/reorder/ExpressionReorder.java
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.scan.expression.optimize.reorder;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.optimize.OptimizeRule;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+/**
+ * reorder Expression by storage order
+ */
+public class ExpressionReorder extends OptimizeRule {
+
+  @Override
+  public Expression optimize(CarbonTable table, Expression expression) {
+    if (!CarbonProperties.isFilterReorderingEnabled()) {
+      return expression;
+    }
+    MultiExpression multiExpression = MultiExpression.build(expression);
+    // unsupported expression
+    if (multiExpression == null) {

Review comment:
       MultiExpression.build returns "multiExpression".
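
The ordering constraint can be sketched as follows: the value being null-checked is the return value of the build call, so the check can only come after it. `BuildThenCheck` and its `build` method are hypothetical stand-ins for `MultiExpression.build`, strings stand in for expressions, and alphabetical sorting stands in for reordering by storage ordinal.

```java
import java.util.Arrays;

// Hypothetical illustration of "build first, then check for null".
final class BuildThenCheck {

  private BuildThenCheck() { }

  // Splits a flat "and" chain, or returns null when the input contains
  // something this sketch does not support (here: an "or").
  static String[] build(String expression) {
    if (expression.contains(" or ")) {
      return null;
    }
    return expression.split(" and ");
  }

  static String optimize(String expression) {
    String[] parts = build(expression);
    // The check reads build's result, so it has to follow the call.
    if (parts == null) {
      return expression; // unsupported input is returned unchanged
    }
    Arrays.sort(parts);
    return String.join(" and ", parts);
  }

  public static void main(String[] args) {
    System.out.println(optimize("b = 2 and a = 1"));
    System.out.println(optimize("b = 2 or a = 1"));
  }
}
```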



