Adapt to SparkSessionExtensions

Ajith shetty
Hi Community

Per https://issues.apache.org/jira/browse/SPARK-18127, Spark provides SparkSessionExtensions to extend the capabilities of Spark. Carbon can use this to avoid the tight coupling caused by CarbonSession in the Spark environment.
https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/SparkSessionExtensions.html

Main Scope:
1. Compatible with Spark 2.3.2+
2. Make Carbon Parser Pluggable
   a. Move to an ANTLR4-based parser
3. Make Analyzer Rules Pluggable
4. Make Optimizer Rules Pluggable
5. Make Planning Strategies Pluggable
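
For illustration, the four extension points listed above correspond to injection methods on SparkSessionExtensions. The following is a minimal Scala sketch under stated assumptions, not Carbon's actual implementation: ExampleCarbonExtensions, NoopRule, NoopStrategy and ExampleApp are hypothetical placeholder names, and each placeholder leaves Spark's behaviour unchanged. A real integration would return Carbon's own parser, rules and strategies instead. It needs Spark SQL (2.2 or later) on the classpath.

```scala
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions, Strategy}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.SparkPlan

// Hypothetical placeholder rule: returns the plan unchanged.
object NoopRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

// Hypothetical placeholder strategy: plans nothing, so Spark falls
// through to its built-in strategies.
object NoopStrategy extends Strategy {
  override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
}

// Hypothetical entry point that registers one hook per scope item.
class ExampleCarbonExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(ext: SparkSessionExtensions): Unit = {
    // Parser: receives Spark's parser as a delegate; returned as-is here.
    ext.injectParser((_, delegate) => delegate)
    // Analyzer rule.
    ext.injectResolutionRule(_ => NoopRule)
    // Optimizer rule.
    ext.injectOptimizerRule(_ => NoopRule)
    // Planning strategy.
    ext.injectPlannerStrategy(_ => NoopStrategy)
  }
}

object ExampleApp {
  def main(args: Array[String]): Unit = {
    // A plain SparkSession with the extensions applied at build time.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("extensions-sketch")
      .withExtensions(new ExampleCarbonExtensions)
      .getOrCreate()
    spark.sql("SELECT 1").show()
    spark.stop()
  }
}
```

Instead of calling withExtensions in code, the same class can be named in the spark.sql.extensions configuration property, so that an unmodified SparkSession picks it up.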

We can create sub-JIRAs to cover all the scenarios arising from this. Please share your thoughts.

Regards

Re: Adapt to SparkSessionExtensions

Jacky Li-3
+1

Since we are starting this refactoring for CarbonData 2.0, which is a major version upgrade, I suggest we also consider the following changes:
1. Make the global dictionary obsolete so that the planning phase is cleaner. Since Spark's Project Tungsten, the benefit gained from the global dictionary is actually small.
2. Make the "stored by" syntax obsolete, so that the CREATE TABLE DDL fully complies with Hive and SparkSQL syntax, keeping only the "stored as" and "using" syntax.
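
As a rough illustration of the three syntaxes named in point 2, the Scala sketch below holds one CREATE TABLE statement per form. The table name, columns and the object name CreateTableSyntaxExamples are hypothetical, and the format name carbondata is assumed. Each string could be passed to spark.sql(...) in a session with Carbon enabled.

```scala
object CreateTableSyntaxExamples {
  // Carbon-specific "stored by" form, proposed above to become obsolete.
  val storedBy: String =
    "CREATE TABLE sales (id INT, name STRING) STORED BY 'carbondata'"

  // Hive-style "stored as" form, proposed to be kept.
  val storedAs: String =
    "CREATE TABLE sales (id INT, name STRING) STORED AS carbondata"

  // SparkSQL data source "using" form, proposed to be kept.
  val usingForm: String =
    "CREATE TABLE sales (id INT, name STRING) USING carbondata"

  def main(args: Array[String]): Unit =
    Seq(storedBy, storedAs, usingForm).foreach(println)
}
```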

Regards,
Jacky

On 2019/08/22 04:58:53, Ajith shetty <[hidden email]> wrote:

> Hi Community
>
> Per https://issues.apache.org/jira/browse/SPARK-18127, Spark provides SparkSessionExtensions to extend the capabilities of Spark. Carbon can use this to avoid the tight coupling caused by CarbonSession in the Spark environment.
> https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/SparkSessionExtensions.html
>
> Main Scope:
> 1. Compatible with Spark 2.3.2+
> 2. Make Carbon Parser Pluggable
>    a. Move to an ANTLR4-based parser
> 3. Make Analyzer Rules Pluggable
> 4. Make Optimizer Rules Pluggable
> 5. Make Planning Strategies Pluggable
>
> We can create sub-JIRAs to cover all the scenarios arising from this. Please share your thoughts.
>
> Regards
>

Re: Adapt to SparkSessionExtensions

Jacky Li-3
In reply to this post by Ajith shetty
I have created branch-2.0, let's work on this feature in this branch.

Regards,
Jacky


Re: Adapt to SparkSessionExtensions

David CaiQiang
In reply to this post by Ajith shetty
+1


regards

David QiangCai

Best Regards
David Cai

Re: Adapt to SparkSessionExtensions

ravipesala
In reply to this post by Jacky Li-3
Hi,

I think it is better to work on the master branch instead of the 2.0 branch.
That will avoid the rebase cost and unnecessary confusion. It is better to go
with proper version quality.

Regards,
Ravindra.

On Mon, 26 Aug 2019 at 8:13 PM, Jacky Li <[hidden email]> wrote:

> I have created branch-2.0, let's work on this feature in this branch.
>
> Regards,
> Jacky
--
Thanks & Regards,
Ravi

Re: Adapt to SparkSessionExtensions

kumarvishal09
+1
Regards
Kumar Vishal

On Sat, 31 Aug 2019 at 14:37, Ravindra Pesala <[hidden email]> wrote:

> Hi,
>
> I think it is better to work on the master branch instead of the 2.0 branch.
> That will avoid the rebase cost and unnecessary confusion. It is better to go
> with proper version quality.
>
> Regards,
> Ravindra.

Re: Adapt to SparkSessionExtensions

xm_zzc
In reply to this post by ravipesala
+1


Zhichao Zhang



Re: Adapt to SparkSessionExtensions

chetdb
In reply to this post by Ajith shetty

Hi Ajith,

1. If the configuration now includes Spark extensions, can we still configure the Carbon store path and the metastore DB path when creating the SparkSession instance?
2. For existing CarbonSession users on Spark 2.3.x, the stated impact is that there will be "no profiler and no refresh across driver". Could these points be elaborated in detail?

Regards
Chetan

