[Discussion]Simplify the deployment of carbondata

David CaiQiang
Hi all,

  I suggest simplifying the deployment of CarbonData as follows:
  1. Remove the Kettle dependency completely, so there is no need to deploy the "carbonplugins" folder on each node or to set "carbon.kettle.home".
  2. Remove the carbon.properties file from the executor side; pass the CarbonData configuration from the driver side to the executor side.
  3. Use "spark.sql.warehouse.dir" (Spark 2) or "hive.metastore.warehouse.dir" (Spark 1) instead of "carbon.storelocation".

  Then, in the future, we will only need to deploy the CarbonData jars in cluster mode.
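One way point 2 could work is sketched below: the driver folds its carbon.properties entries into the Spark configuration under a reserved prefix, and each executor reconstructs them locally, so no carbon.properties file is needed on executor nodes. This is only an illustration of the idea, not CarbonData's actual implementation; the "spark.carbon." prefix, the class, and the helper names are all hypothetical, and the Spark conf is simulated with a plain map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class CarbonConfRelay {
    // Hypothetical key prefix; the real naming would be decided in the implementation.
    static final String PREFIX = "spark.carbon.";

    // Driver side: fold carbon.properties entries into the Spark conf map.
    static Map<String, String> embed(Properties carbonProps, Map<String, String> sparkConf) {
        for (String key : carbonProps.stringPropertyNames()) {
            sparkConf.put(PREFIX + key, carbonProps.getProperty(key));
        }
        return sparkConf;
    }

    // Executor side: rebuild the carbon properties from the propagated Spark conf.
    static Properties extract(Map<String, String> sparkConf) {
        Properties carbonProps = new Properties();
        for (Map.Entry<String, String> e : sparkConf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                carbonProps.setProperty(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return carbonProps;
    }

    public static void main(String[] args) {
        Properties driverSide = new Properties();
        driverSide.setProperty("carbon.sort.file.buffer.size", "20");

        Map<String, String> sparkConf = embed(driverSide, new HashMap<>());
        Properties executorSide = extract(sparkConf);

        // The executor sees the same value without reading any local carbon.properties file.
        System.out.println(executorSide.getProperty("carbon.sort.file.buffer.size"));
    }
}
```

Since Spark already ships its configuration to executors with each application, piggybacking the carbon properties on it removes one per-node deployment step.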

  What's your opinion?

Best Regards
David Cai
Re: [Discussion]Simplify the deployment of carbondata

Liang Chen
Hi

Thanks for starting a good discussion.

For 1 and 2, I agree. The 1.0.0 version will support them.
For 3: We need to keep the parameter so that users can specify Carbon's store location. If users do not specify the carbon store location, we can use the default location you suggested: "spark.sql.warehouse.dir" (Spark 2) or "hive.metastore.warehouse.dir" (Spark 1).
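The fallback behaviour described for point 3 could be resolved along these lines. This is a sketch under stated assumptions: the class and method names are illustrative, not CarbonData's actual API, and the session configuration is modeled as plain properties.

```java
import java.util.Optional;
import java.util.Properties;

public class StoreLocationResolver {
    // Prefer an explicit carbon.storelocation; otherwise fall back to the
    // Spark 2 warehouse dir, then the Spark 1 / Hive warehouse dir.
    static String resolveStoreLocation(Properties carbonProps, Properties sessionConf) {
        return Optional.ofNullable(carbonProps.getProperty("carbon.storelocation"))
                .orElseGet(() -> sessionConf.getProperty(
                        "spark.sql.warehouse.dir",                                  // Spark 2
                        sessionConf.getProperty("hive.metastore.warehouse.dir"))); // Spark 1
    }

    public static void main(String[] args) {
        Properties carbonProps = new Properties(); // no explicit store location set
        Properties sessionConf = new Properties();
        sessionConf.setProperty("spark.sql.warehouse.dir", "/user/hive/warehouse");

        // Falls back to the Spark warehouse directory.
        System.out.println(resolveStoreLocation(carbonProps, sessionConf));
    }
}
```

With this shape, existing deployments that set "carbon.storelocation" keep working unchanged, while new deployments get a sensible default for free.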

Regards
Liang
Re: [Discussion]Simplify the deployment of carbondata

sraghunandan
I suggest we build a parallel implementation without Kettle; once it stabilises, we deprecate the Kettle path and then remove it.
On Mon, 26 Dec 2016 at 1:25 PM, Liang Chen <[hidden email]> wrote:
