Apache CarbonData Dev Mailing List archive

Adapt to SparkSessionExtensions

Ajith shetty

Adapt to SparkSessionExtensions

Hi Community

Since https://issues.apache.org/jira/browse/SPARK-18127, Spark provides SparkSessionExtensions to extend the capabilities of Spark. Carbon can use this to avoid the tight coupling caused by CarbonSession in the Spark environment.
https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/SparkSessionExtensions.html

Main Scope:
1. Compatible with Spark 2.3.2+
2. Make Carbon Parser Pluggable
   a. Move to an ANTLR4-based parser
3. Make Analyzer Rules Pluggable
4. Make Optimizer Rules Pluggable
5. Make Planning Strategies Pluggable

We can create sub-JIRAs to cover all the scenarios affected by this. Please share your thoughts.
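
The four pluggable points in the scope above correspond to injection methods on SparkSessionExtensions. Below is a minimal sketch, assuming Spark 2.3/2.4 is on the classpath. ExampleCarbonExtensions, NoOpRule and NoOpStrategy are hypothetical names used for illustration, not CarbonData classes, and the rules deliberately do nothing:

```scala
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions, Strategy}
import org.apache.spark.sql.catalyst.parser.ParserInterface
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.SparkPlan

// Hypothetical placeholder rule: returns the logical plan unchanged.
class NoOpRule(session: SparkSession) extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

// Hypothetical placeholder strategy: returning Nil means "no physical plan
// from this strategy", so Spark falls through to its built-in strategies.
class NoOpStrategy(session: SparkSession) extends Strategy {
  override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
}

// Hypothetical entry point covering scope items 2 to 5.
class ExampleCarbonExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    // 2. Parser: a real implementation would wrap `delegate` and fall back
    //    to it for statements it does not handle; here it is passed through.
    extensions.injectParser(
      (session: SparkSession, delegate: ParserInterface) => delegate)
    // 3. Analyzer (resolution) rules
    extensions.injectResolutionRule(session => new NoOpRule(session))
    // 4. Optimizer rules
    extensions.injectOptimizerRule(session => new NoOpRule(session))
    // 5. Planning strategies
    extensions.injectPlannerStrategy(session => new NoOpStrategy(session))
  }
}
```

Such a class can be registered either by setting the Spark configuration key spark.sql.extensions to its fully qualified class name, or programmatically with SparkSession.builder().withExtensions(new ExampleCarbonExtensions). Neither path requires a custom session subclass.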

Regards

Jacky Li-3

Re: Adapt to SparkSessionExtensions

+1

Since we are starting this refactoring for CarbonData 2.0, which is a major version upgrade, I suggest we also consider the following changes:
1. Make the global dictionary obsolete so that the planning phase is cleaner. Since Spark's Tungsten project, the benefit gained from the global dictionary is small.
2. Make the "stored by" syntax obsolete, so that the CREATE TABLE DDL fully complies with Hive and SparkSQL syntax, keeping only the "stored as" and "using" syntax.
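
For reference, the three CREATE TABLE forms mentioned in point 2 can be sketched as follows. The table names and columns are hypothetical, and the exact provider name accepted after USING may depend on the CarbonData version:

```scala
object CreateTableForms {
  // Hypothetical column list, for illustration only.
  private val columns = "(id INT, name STRING)"

  // CarbonData-specific form that point 2 proposes to make obsolete.
  val storedBy: String =
    s"CREATE TABLE sales_by $columns STORED BY 'carbondata'"

  // Hive-style form that would be kept.
  val storedAs: String =
    s"CREATE TABLE sales_as $columns STORED AS carbondata"

  // Spark SQL data source form that would be kept.
  val usingForm: String =
    s"CREATE TABLE sales_using $columns USING carbondata"

  // Prints the three statements; each string could instead be passed to
  // spark.sql(...) in a session where CarbonData is available.
  def main(args: Array[String]): Unit =
    Seq(storedBy, storedAs, usingForm).foreach(println)
}
```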

Regards,
Jacky

On 2019/08/22 04:58:53, Ajith shetty <[hidden email]> wrote:

Jacky Li-3

Re: Adapt to SparkSessionExtensions

In reply to this post by Ajith shetty

I have created branch-2.0; let's work on this feature in that branch.

Regards,
Jacky


David CaiQiang

Re: Adapt to SparkSessionExtensions

In reply to this post by Ajith shetty

+1

regards

David QiangCai

Best Regards
David Cai

ravipesala

Re: Adapt to SparkSessionExtensions

In reply to this post by Jacky Li-3

Hi,

I think it is better to work on the master branch instead of the 2.0 branch. It
will avoid the rebase cost and unnecessary confusion. It is better to go
with proper version quality.

Regards,
Ravindra.

On Mon, 26 Aug 2019 at 8:13 PM, Jacky Li <[hidden email]> wrote:

--
Thanks & Regards,
Ravi

kumarvishal09

Re: Adapt to SparkSessionExtensions

+1
Regards
Kumar Vishal

On Sat, 31 Aug 2019 at 14:37, Ravindra Pesala <[hidden email]> wrote:

kumar vishal

xm_zzc

Re: Adapt to SparkSessionExtensions

In reply to this post by ravipesala

+1

Zhichao Zhang

chetdb

Re: Adapt to SparkSessionExtensions

In reply to this post by Ajith shetty

Hi Ajith,

1. If the configuration now includes the Spark extensions, can we still configure the Carbon store path and the metastore DB path while creating the SparkSession instance?
2. The stated impact on existing CarbonSession users of Spark 2.3.x mentions that there will be "no profiler and no refresh across driver". Could these points be elaborated in detail?

Regards
Chetan
