[GitHub] [carbondata] xiaohui0318 opened a new pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon


[GitHub] [carbondata] dhatchayani commented on a change in pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
dhatchayani commented on a change in pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#discussion_r363656690
 
 

 ##########
 File path: hadoop/src/test/java/org/apache/carbondata/hadoop/ft/Hive2CarbonExpressionTest.java
 ##########
 @@ -0,0 +1,354 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.hadoop.ft;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat;
+import org.apache.carbondata.hadoop.testutil.StoreCreator;
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel;
+
+import com.google.common.collect.Lists;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
+import org.apache.hadoop.hive.ql.plan.TableScanDesc;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFIn;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrLessThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPGreaterThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPLessThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotEqual;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPOr;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.Job;
+import org.junit.Assert;
+import org.junit.Test;
+
+/**
+ * @program carbondata
+ * @description: test hive expression to carbondata expression filter
+ * @author: xiaohui
+ * @create: 2020/01/01 15:27
+ */
+
+public class Hive2CarbonExpressionTest {
+  private static StoreCreator creator;
+  private static CarbonLoadModel loadModel;
+  private static CarbonTable table;
+  static {
 
 Review comment:
   Can you please document the usage of the property "hive.optimize.index.filter" in the Hive guide?
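
   As a rough illustration of what such a documentation entry could describe, here is a minimal sketch, assuming that the push-down in this PR only takes effect when "hive.optimize.index.filter" is enabled (in a Hive session, for example via `set hive.optimize.index.filter=true;`). The class name, the `planScan` helper, the `sketch.pushed.filter` key and the "false" fallback are all hypothetical and are not taken from Hive or CarbonData.

```java
import java.util.HashMap;
import java.util.Map;

public class IndexFilterPropertySketch {
  // Property name as quoted in the review comment.
  static final String INDEX_FILTER = "hive.optimize.index.filter";
  // Hypothetical key under which this sketch stores the pushed-down filter.
  static final String PUSHED_FILTER_KEY = "sketch.pushed.filter";

  /**
   * Builds the job settings for a scan. The filter is copied into the job
   * settings only when the property is enabled; otherwise the reader would
   * receive no filter and the engine would filter rows after the scan.
   */
  static Map<String, String> planScan(Map<String, String> session, String filter) {
    Map<String, String> job = new HashMap<>();
    // Assumption of this sketch: treat a missing property as disabled.
    boolean enabled = Boolean.parseBoolean(session.getOrDefault(INDEX_FILTER, "false"));
    if (enabled && filter != null) {
      job.put(PUSHED_FILTER_KEY, filter);
    }
    return job;
  }

  public static void main(String[] args) {
    Map<String, String> session = new HashMap<>();
    // Property not set: nothing is pushed down.
    System.out.println(planScan(session, "id = 1").containsKey(PUSHED_FILTER_KEY));
    // Property enabled: the filter reaches the job settings.
    session.put(INDEX_FILTER, "true");
    System.out.println(planScan(session, "id = 1").get(PUSHED_FILTER_KEY));
  }
}
```

   The point of the sketch is only that the documentation should say which setting turns the push-down on and what happens when it is off.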
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services

[GitHub] [carbondata] dhatchayani commented on a change in pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
dhatchayani commented on a change in pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#discussion_r363657545
 
 

 ##########
 File path: hadoop/src/test/java/org/apache/carbondata/hadoop/ft/Hive2CarbonExpressionTest.java
 ##########
 @@ -0,0 +1,354 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.hadoop.ft;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat;
+import org.apache.carbondata.hadoop.testutil.StoreCreator;
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel;
+
+import com.google.common.collect.Lists;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
+import org.apache.hadoop.hive.ql.plan.TableScanDesc;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFIn;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrLessThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPGreaterThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPLessThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotEqual;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPOr;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.Job;
+import org.junit.Assert;
+import org.junit.Test;
+
+/**
+ * @program carbondata
+ * @description: test hive expression to carbondata expression filter
+ * @author: xiaohui
+ * @create: 2020/01/01 15:27
+ */
+
+public class Hive2CarbonExpressionTest {
+  private static StoreCreator creator;
+  private static CarbonLoadModel loadModel;
+  private static CarbonTable table;
+  static {
+    CarbonProperties.getInstance().
+        addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC, "/tmp/carbon/badrecords");
+    CarbonProperties.getInstance()
+        .addProperty(CarbonCommonConstants.CARBON_SYSTEM_FOLDER_LOCATION, "/tmp/carbon/");
+    CarbonProperties.getInstance()
+        .addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME, "Hive2CarbonExpressionTest");
+    try {
+      creator = new StoreCreator(new File("target/store").getAbsolutePath(),
 
 Review comment:
   Where exactly is the property hive.optimize.index.filter used in these tests?


[GitHub] [carbondata] dhatchayani commented on a change in pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
dhatchayani commented on a change in pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#discussion_r363667102
 
 

 ##########
 File path: hadoop/src/main/java/org/apache/carbondata/hadoop/util/Hive2CarbonExpression.java
 ##########
 @@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.hadoop.util;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.GreaterThanEqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.GreaterThanExpression;
+import org.apache.carbondata.core.scan.expression.conditional.InExpression;
+import org.apache.carbondata.core.scan.expression.conditional.LessThanEqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.LessThanExpression;
+import org.apache.carbondata.core.scan.expression.conditional.ListExpression;
+import org.apache.carbondata.core.scan.expression.conditional.NotEqualsExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonInputFormat;
+
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeFieldDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFIn;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrLessThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPGreaterThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPLessThan;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotEqual;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPOr;
+import org.apache.log4j.Logger;
+
+/**
+ * @description: hive expression to carbon expression
+ */
+public class Hive2CarbonExpression {
+  public static final int left = 0;
+  public static final int right = 1;
+  private static final Logger LOG =
+      LogServiceFactory.getLogService(CarbonInputFormat.class.getName());
+
+  public static Expression convertExprHive2Carbon(ExprNodeDesc exprNodeDesc) {
+    if (exprNodeDesc instanceof ExprNodeGenericFuncDesc) {
+      ExprNodeGenericFuncDesc exprNodeGenericFuncDesc = (ExprNodeGenericFuncDesc) exprNodeDesc;
+      GenericUDF udf = exprNodeGenericFuncDesc.getGenericUDF();
+      List<ExprNodeDesc> ll = exprNodeGenericFuncDesc.getChildren();
+      if (udf instanceof GenericUDFIn) {
+        ColumnExpression columnExpression = new ColumnExpression(ll.get(left).getCols().get(left),
+            getDateType(ll.get(left).getTypeString()));
+        List<Expression> listExpr = new ArrayList<>();
+        for (int i = right; i < ll.size(); i++) {
+          LiteralExpression literalExpression = new LiteralExpression(ll.get(i).getExprString(),
+              getDateType(ll.get(left).getTypeString()));
+          listExpr.add(literalExpression);
+        }
+        ListExpression listExpression = new ListExpression(listExpr);
+        return new InExpression(columnExpression, listExpression);
+      } else if (udf instanceof GenericUDFOPOr) {
+        Expression leftExpression =
+            convertExprHive2Carbon(exprNodeGenericFuncDesc.getChildren().get(left));
+        Expression rightExpression =
+            convertExprHive2Carbon(exprNodeGenericFuncDesc.getChildren().get(right));
+        return new OrExpression(leftExpression, rightExpression);
+      } else if (udf instanceof GenericUDFOPAnd) {
+        Expression leftExpression =
+            convertExprHive2Carbon(exprNodeGenericFuncDesc.getChildren().get(left));
+        Expression rightExpression =
+            convertExprHive2Carbon(exprNodeGenericFuncDesc.getChildren().get(right));
+        return new AndExpression(leftExpression, rightExpression);
+
+      } else if (udf instanceof GenericUDFOPEqual) {
+        ColumnExpression columnExpression = null;
+        if (ll.get(left) instanceof ExprNodeFieldDesc) {
 
 Review comment:
   Will this handle all the complex data types such as STRUCT, ARRAY and MAP?
   Please add test cases that try to create a filter expression for each of the complex data types.
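
   To make the concern concrete, here is a minimal sketch of one conservative policy: convert only comparisons on primitive columns and decline to push down anything that touches a complex type. It does not use the real Hive or CarbonData classes; `Node`, `Column`, `Field`, `Literal`, `Equal` and `convert` are hypothetical stand-ins, and whether the PR should behave this way is an assumption, not something stated in the thread.

```java
import java.util.Locale;
import java.util.Optional;

public class ComplexTypeFilterSketch {
  /** Hypothetical stand-in for an expression node; not the real ExprNodeDesc API. */
  interface Node {}

  /** A plain column reference with its type string, e.g. "int" or "array<int>". */
  static final class Column implements Node {
    final String name;
    final String typeName;
    Column(String name, String typeName) { this.name = name; this.typeName = typeName; }
  }

  /** Field access on a struct column, e.g. s.f. */
  static final class Field implements Node {
    final Node parent;
    final String fieldName;
    Field(Node parent, String fieldName) { this.parent = parent; this.fieldName = fieldName; }
  }

  static final class Literal implements Node {
    final String value;
    Literal(String value) { this.value = value; }
  }

  static final class Equal implements Node {
    final Node left;
    final Node right;
    Equal(Node left, Node right) { this.left = left; this.right = right; }
  }

  /** Type strings such as struct<...>, array<...> and map<...> count as complex. */
  static boolean isComplex(String typeName) {
    String t = typeName.toLowerCase(Locale.ROOT);
    return t.startsWith("struct") || t.startsWith("array") || t.startsWith("map");
  }

  /**
   * Returns a textual filter for "primitive column = literal" and an empty
   * result for everything else, so unsupported shapes are simply not pushed down.
   */
  static Optional<String> convert(Node node) {
    if (!(node instanceof Equal)) {
      return Optional.empty();
    }
    Equal eq = (Equal) node;
    if (!(eq.left instanceof Column) || !(eq.right instanceof Literal)) {
      return Optional.empty(); // covers struct field access such as s.f = 1
    }
    Column column = (Column) eq.left;
    if (isComplex(column.typeName)) {
      return Optional.empty();
    }
    return Optional.of(column.name + " = " + ((Literal) eq.right).value);
  }

  public static void main(String[] args) {
    System.out.println(convert(new Equal(new Column("id", "int"), new Literal("1"))).isPresent());
    System.out.println(convert(new Equal(new Column("tags", "array<int>"), new Literal("1"))).isPresent());
  }
}
```

   Test cases along these lines, one per complex type plus one for struct field access, would show whether the converter produces a usable filter or falls back cleanly.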


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#issuecomment-571654660
 
 
   Build Failed with Spark 2.3.4. Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1509/
   


[GitHub] [carbondata] xiaohui0318 commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
xiaohui0318 commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#issuecomment-571685321
 
 
   retest this please


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#issuecomment-571718267
 
 
   Build Success with Spark 2.3.4. Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1514/
   


[GitHub] [carbondata] jackylk commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
jackylk commented on issue #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557#issuecomment-571931456
 
 
   LGTM


[GitHub] [carbondata] asfgit closed pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon

GitBox
In reply to this post by GitBox
asfgit closed pull request #3557: [CARBONDATA-3649] Hive expression is pushed down to carbon
URL: https://github.com/apache/carbondata/pull/3557
 
 
   
