[Discuss]Adapt MV datamap to spark 2.1 version

qiuchenjian
Hi all,

This is a summary of the changes needed to run the MV datamap on Spark 2.1.
With these changes all MV test cases pass. For details, see the attached
MvAdaptSpark2.1 document. Any suggestions?
 
1. Classes that we cannot access in Spark 2.1
    (1). org.apache.spark.internal.Logging
    (2). org.apache.spark.sql.internal.SQLConf
  Solution: Create classes in the spark2.1 module that extend the classes
above.

2. Classes that Spark 2.1 does not have
    (1). org.apache.spark.sql.catalyst.plans.logical.Subquery
    (2). org.apache.spark.sql.catalyst.catalog.interface.HiveTableRelation
  Solution: Use CatalogRelation instead, and drop the usage in
LogicalPlanSignatureGenerator.
            Move the Subquery code into the carbon project.

3. Methods that we cannot access in Spark 2.1
    (1). sparkSession.sessionState.catalog.lookupRelation
  Solution: Add this method to CarbonToSparkAdapter in the spark2.1 module.

4. Classes whose interface has changed
    (1). org.apache.spark.sql.catalyst.expressions.SortOrder
    (2). org.apache.spark.sql.catalyst.expressions.Cast
    (3). org.apache.spark.sql.catalyst.plans.Statistics
  Solution: Adapt to the changed interfaces.

5. Methods and configuration entries that Spark 2.1 does not have
    (1). normalizeExprId and canonicalized of
org.apache.spark.sql.catalyst.plans.QueryPlan
    (2). CASE_SENSITIVE of SQLConf
    (3). STARSCHEMA_DETECTION of SQLConf
  Solution: Do not use normalizeExprId, canonicalized, CASE_SENSITIVE or
STARSCHEMA_DETECTION.

6. Logical plan optimization rules that Spark 2.1 does not have
    (1). SimplifyCreateMapOps
    (2). SimplifyCreateArrayOps
    (3). SimplifyCreateStructOps
    (4). RemoveRedundantProject
    (5). RemoveRedundantAliases
    (6). PullupCorrelatedPredicates
    (7). ReplaceDeduplicateWithAggregate
    (8). EliminateView
  Solution: Delete the code that uses them, or move the rules into the carbon
project.

7. Create the instances in SparkSQLUtil so that the code adapts to Spark 2.1.

8. Make query SQL pass the MV check in Spark 2.1 (CarbonSessionState).
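
As a rough illustration of points 1 and 3 (this is not code from the patch), the sketch below shows how a restricted member can be re-exposed by a small bridge that sits inside the restricted scope. All names are hypothetical. Nested objects stand in for packages so that the snippet runs as a single script; with real packages, the bridge would be a class placed under the library's own package name.

```scala
// Toy stand-in for a library; `lib`, `internal` and `bridge` are hypothetical names.
object lib {
  object internal {
    // Visible only to code nested inside `lib`, similar in spirit to a
    // package-qualified private member.
    private[lib] trait Logging {
      def logInfo(msg: String): String = s"INFO $msg"
    }
  }

  object bridge {
    // Declared inside `lib`, so it is allowed to extend the restricted trait.
    // The bridge itself is public and can be used from anywhere.
    trait PublicLogging extends internal.Logging
  }
}

// Code outside `lib`: naming lib.internal.Logging here would not compile,
// but mixing in the public bridge works.
object MvRewrite extends lib.bridge.PublicLogging {
  def run(): String = logInfo("rewriting plan")
}
```

Calling MvRewrite.run() goes through the bridge without touching the restricted trait directly, which is the same idea as creating classes in the spark2.1 module that extend the inaccessible Spark classes.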

MvAdaptSpark2.pdf
<http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/file/t378/MvAdaptSpark2.pdf>  



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [Discuss]Adapt MV datamap to spark 2.1 version

akashrn5
Hi,

Are the changes to support 2.1 intrusive, or are you going to use a
decoupling strategy?
I think decoupling would be better: once we decide to remove 2.1 from the
carbondata code, it will be easy to remove.


Thanks




Re: [Discuss]Adapt MV datamap to spark 2.1 version

qiuchenjian
Hi Akash R,

For the MV datamap to support Spark 2.1, we must modify the MV internal code.

But the MV datamap framework is independent and decoupled from Spark 2.1.

If we decide to remove 2.1, MV will still work with Spark 2.2 or above
without modification.
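
To make the decoupling concrete, here is a minimal sketch of the idea, assuming the version-specific calls sit behind one small interface. The trait, object and method names are hypothetical and do not reflect the real CarbonToSparkAdapter API. A runtime lookup is used only so the sketch fits in one file; how the real modules are selected is not covered in this thread.

```scala
// Hypothetical interface for calls whose shape differs between Spark versions.
trait SparkAdapter {
  def version: String
  def lookupRelation(table: String): String
}

// Would live in the spark2.1 module and be deleted together with it.
object Spark21Adapter extends SparkAdapter {
  val version = "2.1"
  def lookupRelation(table: String): String = s"relation($table) via 2.1 API"
}

// Would live in the spark2.2 module.
object Spark22Adapter extends SparkAdapter {
  val version = "2.2"
  def lookupRelation(table: String): String = s"relation($table) via 2.2 API"
}

object AdapterRegistry {
  private val adapters: Map[String, SparkAdapter] =
    Map("2.1" -> Spark21Adapter, "2.2" -> Spark22Adapter)

  // Reduce a full version string such as "2.1.0" to "2.1" and look it up.
  def forVersion(sparkVersion: String): SparkAdapter = {
    val majorMinor = sparkVersion.split('.').take(2).mkString(".")
    adapters.getOrElse(
      majorMinor,
      throw new IllegalArgumentException(s"Unsupported Spark version: $sparkVersion"))
  }
}

// Version-independent MV code: it only ever sees the trait.
object MvCore {
  def describe(adapter: SparkAdapter, table: String): String =
    s"[spark ${adapter.version}] ${adapter.lookupRelation(table)}"
}
```

With this layout, dropping 2.1 means deleting Spark21Adapter and its registry entry; MvCore stays unchanged.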



