GitHub user ravipesala opened a pull request:
https://github.com/apache/carbondata/pull/2444 [CARBONDATA-2686] Implement Left join on MV datamap Code ported from another branch done by @eychih to implement Left outer join. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata left-join Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2444.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2444 ---- commit 354cc01a1be116efbba4f1e5fecf97f76ce3ee61 Author: eychih <eychih@...> Date: 2018-06-01T02:10:34Z add matching LEFTJOIN VIEW commit 07828da6834290e713d45fe4acc81a61de3656b3 Author: eychih <eychih@...> Date: 2018-06-07T01:14:24Z add changes for advisor to recommend leftjoin view ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6748/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2444 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5594/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5578/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2444 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6824/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5622/ --- |
In reply to this post by qiuchenjian-2
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2444 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6841/ --- |
In reply to this post by qiuchenjian-2
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2444 retest sdv please --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2444 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5640/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5630/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5639/ --- |
In reply to this post by qiuchenjian-2
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2444 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6867/ --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560651 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala --- @@ -127,8 +126,27 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper } } - def apply( - subsumer: ModularPlan, + private def isLeftJoinView(subsumer: ModularPlan): Boolean = { + if (subsumer.isInstanceOf[modular.Select]) { + val sel = subsumer.asInstanceOf[modular.Select] --- End diff -- better to use match clause to simply the logic in this function --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560747 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala --- @@ -245,15 +269,22 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper } } val tPredicateList = sel_1q.predicateList.filter { p => - !sel_1a.predicateList.exists(_.semanticEquals(p)) } - val wip = sel_1q.copy( - predicateList = tPredicateList, - children = tChildren, - joinEdges = tJoinEdges.filter(_ != null), - aliasMap = tAliasMap.toMap) - - val done = factorOutSubsumer(wip, usel_1a, wip.aliasMap) - Seq(done) + !sel_1a.predicateList.exists(_.semanticEquals(p)) + } ++ (if (isLeftJoinView(sel_1a) && + sel_1q.joinEdges.head.joinType == Inner) { + sel_1a.children(1) + .asInstanceOf[HarmonizedRelation].tag.map(IsNotNull(_)).toSeq + } else { + Seq.empty + }) + val wip = sel_1q.copy( --- End diff -- can you give a better name for this variable --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560875 --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala --- @@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf) def batches: Seq[Batch] = { Batch( "Data Harmonizations", fixedPoint, - HarmonizeDimensionTable, - HarmonizeFactTable) :: Nil + Seq( HarmonizeDimensionTable) ++ + extendedOperatorHarmonizationRules: _*) :: Nil +// HarmonizeFactTable) :: Nil --- End diff -- If it is not require, remove it --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560931 --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala --- @@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf) def batches: Seq[Batch] = { Batch( "Data Harmonizations", fixedPoint, - HarmonizeDimensionTable, - HarmonizeFactTable) :: Nil + Seq( HarmonizeDimensionTable) ++ + extendedOperatorHarmonizationRules: _*) :: Nil +// HarmonizeFactTable) :: Nil } + + /** + * Override to provide additional rules for the modular operator harmonization batch. + */ + def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] = Nil +} + +/** + * A full Harmonizer - harmonize both fact and dimension tables + */ +object FullHarmonizer extends FullHarmonizer + +class FullHarmonizer extends Harmonizer(new SQLConf()) { + override def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] = + super.extendedOperatorHarmonizationRules ++ (HarmonizeFactTable :: Nil) } /** - * An default Harmonizer + * A semi Harmonizer - harmonize dimension tables only --- End diff -- Can you add description for semi harmonizer? --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200561054 --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularRelation.scala --- @@ -86,10 +87,28 @@ object HarmonizedRelation { Select(_, _, _, _, _, dim :: Nil, NoFlags, Nil, Nil, _), NoFlags, Nil, _) if (dim.isInstanceOf[ModularRelation]) => - if (g.outputList - .forall(col => col.isInstanceOf[AttributeReference] || - (col.isInstanceOf[Alias] && - col.asInstanceOf[Alias].child.isInstanceOf[AttributeReference]))) { + if (g.outputList.forall(col => { --- End diff -- better to move the logic out of if condition. and have it better formatted --- |
Free forum by Nabble | Edit this page |