http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Feature-Proposal-Spark-2-integration-with-CarbonData-tp3236p3267.html
Spark2.x. As Spark2.x has major design and interface changes, it is also a
challenge to support both Spark2.x and Spark1.x. We can start creating
> Hi
>
> Very excited to see that CarbonData will integrate with Spark 2.x. I look
> forward to further performance improvements and enhanced usability.
>
> Regards
> Liang
>
>
> Jacky Li wrote
> > Hi all,
> >
> > Currently CarbonData only works with Spark 1.5 and Spark 1.6. As the
> > Apache Spark community is moving to 2.1, more and more users will deploy
> > Spark 2.x in production environments. To make CarbonData even more
> > popular, I think now is a good time to start considering Spark 2.x
> > integration with CarbonData.
> >
> > Moreover, we can take this as a chance to refactor CarbonData to make it
> > both easier to use and faster.
> >
> > Usability:
> > Instead of using CarbonContext, in the Spark 2 integration users should
> > be able to:
> > 1. use the native SparkSession in the Spark application to create and
> > query tables backed by CarbonData files with full feature support,
> > including index and late decode optimization.
> >
> > 2. use CarbonData's API and tools to accomplish Carbon-specific tasks,
> > like compaction, delete segment, etc.
> >
> > Performance:
> > 1. deep integration with the Datasource API, leveraging Spark 2's
> > whole-stage codegen feature.
> >
> > 2. provide an implementation of a vectorized record reader, to improve
> > scanning performance.
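
To make the vectorized record reader idea above concrete, here is a minimal
sketch in plain Python. It is not CarbonData or Spark code: the in-memory
"columnar file" and the names read_rows, read_batches, sum_where_rows and
sum_where_batches are all hypothetical, chosen only for illustration. The
point it tries to show is that a vectorized reader hands the query engine a
batch of column values per call instead of one record per call, so the
per-record overhead is paid once per batch.

```python
from typing import Dict, Iterator, List

# Hypothetical in-memory "columnar file": each column is stored contiguously.
Columns = Dict[str, List[int]]


def read_rows(columns: Columns) -> Iterator[Dict[str, int]]:
    """Row-at-a-time reader: materializes one record per iteration."""
    names = list(columns)
    num_rows = len(columns[names[0]]) if names else 0
    for i in range(num_rows):
        yield {name: columns[name][i] for name in names}


def read_batches(columns: Columns, batch_size: int) -> Iterator[Columns]:
    """Vectorized reader: yields column slices of up to batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    names = list(columns)
    num_rows = len(columns[names[0]]) if names else 0
    for start in range(0, num_rows, batch_size):
        yield {name: columns[name][start:start + batch_size] for name in names}


def sum_where_rows(columns: Columns, value_col: str,
                   filter_col: str, threshold: int) -> int:
    """SUM(value_col) WHERE filter_col > threshold, one record at a time."""
    total = 0
    for row in read_rows(columns):
        if row[filter_col] > threshold:
            total += row[value_col]
    return total


def sum_where_batches(columns: Columns, value_col: str, filter_col: str,
                      threshold: int, batch_size: int = 4) -> int:
    """Same query, but each reader call returns a whole batch of columns."""
    total = 0
    for batch in read_batches(columns, batch_size):
        values, keys = batch[value_col], batch[filter_col]
        # Tight loop over two column slices; no per-row record is built.
        total += sum(v for v, k in zip(values, keys) if k > threshold)
    return total


# Made-up sample data, only to exercise the two readers.
table: Columns = {
    "id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "amount": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
}

if __name__ == "__main__":
    print(sum_where_rows(table, "amount", "id", 5))
    print(sum_where_batches(table, "amount", "id", 5, batch_size=4))
```

Both functions return the same answer; only the unit of work handed over by
the reader differs. In a real engine the batch would be a set of typed
column vectors rather than Python lists, which is where most of the
scanning speedup would be expected to come from.
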
> >
> > Since Spark 2 changes a lot compared to Spark 1.6, it may take some time
> > to complete all these features. With the help of contributors and
> > committers, I hope we can have basic features working in the next
> > CarbonData release.
> >
> > What do you think about this idea? All kinds of contributions and
> > suggestions are welcome.
> >
> > Regards,
> > Jacky Li