GitHub user mohammadshahidkhan opened a pull request:
https://github.com/apache/carbondata/pull/1647 [CARBONDATA-1887] block pruning not happening is carbon for ShortType and SmallIntType columns ⦠Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [X ] Any interfaces changed? NA - [X] Any backward compatibility impacted? NA - [X] Document update required? NA - [X] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. Done manual testing - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mohammadshahidkhan/incubator-carbondata block_pruning_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1647.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1647 ---- commit 4cbae0b7c87d0680b5afa1517da5b0841d58fd92 Author: mohammadshahidkhan <[hidden email]> Date: 2017-12-12T12:39:51Z [CARBONDATA-1887] block pruning not happening is carbon for ShortType and SmallIntType columns ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/667/ --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1647#discussion_r156363228 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala --- @@ -367,6 +474,13 @@ object CastExpressionOptimization { } else { Some(CastExpr(c)) } + case s: ShortType if t.sameType(IntegerType) => + val value = v.asInstanceOf[Integer].toShort + if (value.toShort.equals(v)) { --- End diff -- change it to toInt --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1896/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1647 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2227/ --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1647#discussion_r156387038 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala --- @@ -129,6 +143,13 @@ object CastExpressionOptimization { } else { Some(CastExpr(c)) } + case s: ShortType if t.sameType(IntegerType) => --- End diff -- almost code is same in each case please try to refactor --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1899/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/670/ --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1647#discussion_r156401253 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala --- @@ -367,6 +474,13 @@ object CastExpressionOptimization { } else { Some(CastExpr(c)) } + case s: ShortType if t.sameType(IntegerType) => + val value = v.asInstanceOf[Integer].toShort + if (value.toShort.equals(v)) { --- End diff -- Fixed --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1647#discussion_r156401352 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala --- @@ -129,6 +143,13 @@ object CastExpressionOptimization { } else { Some(CastExpr(c)) } + case s: ShortType if t.sameType(IntegerType) => --- End diff -- Refactored --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1647 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2238/ --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on the issue:
https://github.com/apache/carbondata/pull/1647 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/680/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1909/ --- |
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1647#discussion_r156671241 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala --- @@ -343,32 +294,106 @@ object CastExpressionOptimization { Some(CastExpr(c)) } case i: IntegerType if t.sameType(DoubleType) => - val value = v.asInstanceOf[Double].toInt - if (value.toDouble.equals(v)) { - Some(sources.LessThanOrEqual(a.name, value)) - } else { - Some(CastExpr(c)) - } + updateFilterForInt(v, c) + case s: ShortType if t.sameType(IntegerType) => + updateFilterForShort(v, c) case _ => Some(CastExpr(c)) } case c@LessThanOrEqual(Literal(v, t), Cast(a: Attribute, _)) => a.dataType match { case ts: TimestampType if t.sameType(StringType) => - val value = typeCastStringToLong(v) - if (!value.equals(v)) { - Some(sources.GreaterThanOrEqual(a.name, value)) - } else { - Some(CastExpr(c)) - } + updateFilterForTimeStamp(v, c) case i: IntegerType if t.sameType(DoubleType) => - val value = v.asInstanceOf[Double].toInt - if (value.toDouble.equals(v)) { - Some(sources.GreaterThanOrEqual(a.name, value)) - } else { - Some(CastExpr(c)) - } + updateFilterForInt(v, c) + case s: ShortType if t.sameType(IntegerType) => + updateFilterForShort(v, c) case _ => Some(CastExpr(c)) } } } + + /** + * the method removes the cast for short type columns + * @param actualValue + * @param exp + * @return + */ + def updateFilterForShort(actualValue: Any, exp: Expression): Option[sources.Filter] = { + val newValue = actualValue.asInstanceOf[Integer].toShort + if (newValue.toInt.equals(actualValue)) { + updateFilterBasedOnFilterType(exp, newValue) + } else { + Some(CastExpr(exp)) + } + } + + /** + * the method removes the cast for int type columns + * + * @param actualValue + * @param exp + * @return + */ + def updateFilterForInt(actualValue: Any, exp: Expression): Option[sources.Filter] = { + val newValue = actualValue.asInstanceOf[Double].toInt + if (newValue.toDouble.equals(actualValue)) { + updateFilterBasedOnFilterType(exp, newValue) + } else { + Some(CastExpr(exp)) + } + } + + /** + * the method removes the cast for timestamp type columns + * + * @param actualValue + * @param exp + * @return + */ + def updateFilterForTimeStamp(actualValue: Any, exp: Expression): Option[sources.Filter] = { + val newValue = typeCastStringToLong(actualValue) + if (!newValue.equals(actualValue)) { + updateFilterBasedOnFilterType(exp, newValue) + } else { + Some(CastExpr(exp)) + } + } + + /** + * the method removes the cast for the respective filter type + * + * @param exp + * @param newValue + * @return + */ + def updateFilterBasedOnFilterType(exp: Expression, --- End diff -- Is is possible to handle this method in checkifcastcanberemove?? --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1647#discussion_r156671548 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala --- @@ -343,32 +294,106 @@ object CastExpressionOptimization { Some(CastExpr(c)) } case i: IntegerType if t.sameType(DoubleType) => - val value = v.asInstanceOf[Double].toInt - if (value.toDouble.equals(v)) { - Some(sources.LessThanOrEqual(a.name, value)) - } else { - Some(CastExpr(c)) - } + updateFilterForInt(v, c) + case s: ShortType if t.sameType(IntegerType) => + updateFilterForShort(v, c) case _ => Some(CastExpr(c)) } case c@LessThanOrEqual(Literal(v, t), Cast(a: Attribute, _)) => a.dataType match { case ts: TimestampType if t.sameType(StringType) => - val value = typeCastStringToLong(v) - if (!value.equals(v)) { - Some(sources.GreaterThanOrEqual(a.name, value)) - } else { - Some(CastExpr(c)) - } + updateFilterForTimeStamp(v, c) case i: IntegerType if t.sameType(DoubleType) => - val value = v.asInstanceOf[Double].toInt - if (value.toDouble.equals(v)) { - Some(sources.GreaterThanOrEqual(a.name, value)) - } else { - Some(CastExpr(c)) - } + updateFilterForInt(v, c) + case s: ShortType if t.sameType(IntegerType) => + updateFilterForShort(v, c) case _ => Some(CastExpr(c)) } } } + + /** + * the method removes the cast for short type columns + * @param actualValue + * @param exp + * @return + */ + def updateFilterForShort(actualValue: Any, exp: Expression): Option[sources.Filter] = { + val newValue = actualValue.asInstanceOf[Integer].toShort + if (newValue.toInt.equals(actualValue)) { + updateFilterBasedOnFilterType(exp, newValue) + } else { + Some(CastExpr(exp)) + } + } + + /** + * the method removes the cast for int type columns + * + * @param actualValue + * @param exp + * @return + */ + def updateFilterForInt(actualValue: Any, exp: Expression): Option[sources.Filter] = { + val newValue = actualValue.asInstanceOf[Double].toInt + if (newValue.toDouble.equals(actualValue)) { + updateFilterBasedOnFilterType(exp, newValue) + } else { + Some(CastExpr(exp)) + } + } + + /** + * the method removes the cast for timestamp type columns + * + * @param actualValue + * @param exp + * @return + */ + def updateFilterForTimeStamp(actualValue: Any, exp: Expression): Option[sources.Filter] = { + val newValue = typeCastStringToLong(actualValue) + if (!newValue.equals(actualValue)) { + updateFilterBasedOnFilterType(exp, newValue) + } else { + Some(CastExpr(exp)) + } + } + + /** + * the method removes the cast for the respective filter type + * + * @param exp + * @param newValue + * @return + */ + def updateFilterBasedOnFilterType(exp: Expression, --- End diff -- I think its better to handle in different method otherwise for each type we need to add if statement to handle whether updated value is of same type or not --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on the issue:
https://github.com/apache/carbondata/pull/1647 retest sdv please --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1647 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2266/ --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on the issue:
https://github.com/apache/carbondata/pull/1647 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1647 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/720/ --- |
Free forum by Nabble | Edit this page |