Apache CarbonData Dev Mailing List archive

[Discuss]Adapt MV datamap to spark 2.1 version

qiuchenjian

[Discuss]Adapt MV datamap to spark 2.1 version

Hi, all

Here is a summary of the changes needed to run the MV datamap on Spark 2.1.
With these changes, all MV test cases pass. For details, see the attached
MvAdaptSpark2.pdf. Any suggestions?

1. Classes we cannot access in Spark 2.1
(1). org.apache.spark.internal.Logging
(2). org.apache.spark.sql.internal.SQLConf
Solution: Create classes in the spark2.1 module that extend the classes above
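
One way such an extending class can work, assuming the inaccessible classes are restricted to Spark's own package scope (the thread does not state the exact modifier): a bridge declared inside the restricted scope can extend the class and re-expose it publicly. The sketch below is plain Scala with no Spark dependency. `vendor`, `internal`, `Logging` (a stand-in, not Spark's trait), `LoggingBridge`, and `MvComponent` are hypothetical names, and nested objects stand in for packages so the snippet fits in one file.

```scala
// Hypothetical stand-in for a library whose Logging trait is scope-restricted.
// Nested objects play the role of packages here; nothing below is Spark code.
object vendor {
  object internal {
    // Visible only to code nested inside `vendor`.
    private[vendor] trait Logging {
      protected def logInfo(msg: String): String = s"[INFO] $msg"
    }
  }

  // The bridge is declared inside `vendor`, so it may extend the restricted
  // trait. The bridge itself is public and re-exposes the functionality.
  trait LoggingBridge extends internal.Logging {
    def info(msg: String): String = logInfo(msg)
  }
}

// Code outside `vendor` cannot name vendor.internal.Logging directly,
// but it can mix in the public bridge.
class MvComponent extends vendor.LoggingBridge {
  def describe(): String = info("MV component initialized")
}
```

With real packages, the bridge would be a source file in the version-specific module that declares the same package as the restricted class.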

2. Classes that Spark 2.1 does not have
(1). org.apache.spark.sql.catalyst.plans.logical.Subquery
(2). org.apache.spark.sql.catalyst.catalog.HiveTableRelation (defined in
interface.scala)
Solution: Use CatalogRelation instead, and do not use HiveTableRelation in
LogicalPlanSignatureGenerator. Move the Subquery code into the carbon project

3. Methods we cannot access in Spark 2.1
(1). sparkSession.sessionState.catalog.lookupRelation
Solution: Add this method to CarbonToSparkAdapter in the spark2.1 module
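
A minimal sketch of the adapter idea behind this solution: the MV code calls one version-neutral method, and each Spark version supplies its own implementation. This is plain Scala with no Spark dependency. `TableId`, `Relation`, `SparkAdapter`, `Spark21Adapter`, `Spark22PlusAdapter`, and `Adapters.forVersion` are hypothetical names, not the actual CarbonToSparkAdapter code. The sketch picks the adapter at runtime from a version string; a build that compiles only one version-specific module would not need that step.

```scala
// Hypothetical stand-ins for a table identifier and a resolved relation.
final case class TableId(database: Option[String], table: String)
final case class Relation(resolvedBy: String, qualifiedName: String)

// Version-neutral interface that the MV code would call.
trait SparkAdapter {
  def lookupRelation(id: TableId): Relation

  // Shared helper: "db.table" when a database is given, else "table".
  protected def qualify(id: TableId): String =
    id.database.map(db => s"$db.${id.table}").getOrElse(id.table)
}

// Would live in the spark2.1 module and wrap the call that the MV code
// cannot reach directly in that version.
object Spark21Adapter extends SparkAdapter {
  def lookupRelation(id: TableId): Relation =
    Relation("spark2.1 adapter", qualify(id))
}

// Would live in the module for Spark 2.2 and above.
object Spark22PlusAdapter extends SparkAdapter {
  def lookupRelation(id: TableId): Relation =
    Relation("spark2.2+ adapter", qualify(id))
}

object Adapters {
  // Selects an adapter from a version string such as "2.1.0".
  def forVersion(sparkVersion: String): SparkAdapter =
    if (sparkVersion == "2.1" || sparkVersion.startsWith("2.1.")) Spark21Adapter
    else Spark22PlusAdapter
}
```

Callers depend only on `SparkAdapter`, so dropping one version later means deleting one implementation rather than touching the calling code.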

4. Classes whose interfaces changed
(1). org.apache.spark.sql.catalyst.expressions.SortOrder
(2). org.apache.spark.sql.catalyst.expressions.Cast
(3). org.apache.spark.sql.catalyst.plans.logical.Statistics
Solution: Adapt to the new interfaces

5. Methods that Spark 2.1 does not have
(1). normalizeExprId and canonicalized of
org.apache.spark.sql.catalyst.plans.QueryPlan
(2). CASE_SENSITIVE of SQLConf
(3). STARSCHEMA_DETECTION of SQLConf
Solution: Do not use normalizeExprId, canonicalized, CASE_SENSITIVE, or
STARSCHEMA_DETECTION

6. Some logical plan optimization rules that Spark 2.1 does not have
(1). SimplifyCreateMapOps
(2). SimplifyCreateArrayOps
(3). SimplifyCreateStructOps
(4). RemoveRedundantProject
(5). RemoveRedundantAliases
(6). PullupCorrelatedPredicates
(7). ReplaceDeduplicateWithAggregate
(8). EliminateView
Solution: Delete the code that uses them, or move the rule code into the carbon project
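
To illustrate what moving a rule into one's own project involves: an optimizer rule is a self-contained function from plan to plan, so it can be re-implemented outside Spark. The sketch below mimics the idea of RemoveRedundantProject (drop a projection whose output equals its child's output) on a toy plan type. It is plain Scala with no Spark dependency. `Plan`, `Scan`, `Filter`, `Project`, `PlanRule`, and `RemoveRedundantProjectSketch` are hypothetical stand-ins, not Catalyst classes, and columns are compared by name only.

```scala
// Toy logical plan: each node reports the column names it outputs.
sealed trait Plan { def output: Seq[String] }

final case class Scan(table: String, output: Seq[String]) extends Plan

final case class Filter(condition: String, child: Plan) extends Plan {
  def output: Seq[String] = child.output
}

final case class Project(columns: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = columns
}

// A rule is just a function from plan to plan.
trait PlanRule { def apply(plan: Plan): Plan }

// Removes a Project that selects exactly its child's output, in the same order.
object RemoveRedundantProjectSketch extends PlanRule {
  def apply(plan: Plan): Plan = plan match {
    case Project(columns, child) =>
      val newChild = apply(child)
      if (columns == newChild.output) newChild else Project(columns, newChild)
    case Filter(condition, child) => Filter(condition, apply(child))
    case scan: Scan => scan
  }
}
```

For example, applying the rule to `Project(Seq("a", "b"), Scan("t", Seq("a", "b")))` returns the `Scan` alone, while a projection that narrows or reorders columns is kept.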

7. Generate the instances in SparkSQLUtil to adapt to Spark 2.1

8. Make SQL queries go through the MV check in Spark 2.1 (CarbonSessionState)

MvAdaptSpark2.pdf
<http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/file/t378/MvAdaptSpark2.pdf>

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

akashrn5

Re: [Discuss]Adapt MV datamap to spark 2.1 version

Hi,

Are the changes needed to support 2.1 intrusive, or are you going to use a
decoupling strategy?
I think decoupling would be better: once we decide to remove 2.1 from the
carbondata code, it will be easy to remove.

Thanks

qiuchenjian

Re: [Discuss]Adapt MV datamap to spark 2.1 version

Hi Akash R,

For the MV datamap to support Spark 2.1, we must modify the MV internal code.

But the MV datamap framework is independent; it is decoupled from Spark 2.1.

If we decide to remove 2.1, MV will still work with Spark 2.2 or above
without modification.
