[DISCUSSION] Optimize the default value for some parameters


Liang Chen
Administrator
Hi all,

As you know, the default values of some parameters need to be adjusted to
suit most use cases. This discussion is for collecting the parameters whose
default values need to be optimized:

1. TABLE_BLOCKSIZE:
The current default is 1G; I propose adjusting it to 512M.

2.
Please append here any other parameters whose default values you propose to
adjust.

Regards
Liang



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [DISCUSSION] Optimize the default value for some parameters

xm_zzc
Hi,
  If we are using CarbonData with Spark to load data, we can set
carbon.number.of.cores.while.loading to the number of executor cores.

  For example, when the number of executor cores is set to 6, at least 6
cores per node are available for loading data, so
carbon.number.of.cores.while.loading can be set to 6 automatically.
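
A minimal sketch of this idea, written in plain Python rather than taken from CarbonData's code: it derives a value for carbon.number.of.cores.while.loading from Spark's spark.executor.cores setting. The function name resolve_loading_cores, the dict-based input, and the fallback value of 2 are assumptions made for illustration only.

```python
def resolve_loading_cores(spark_conf, fallback=2):
    """Pick a value for carbon.number.of.cores.while.loading.

    spark_conf is a plain dict of Spark settings (hypothetical input format).
    fallback is an assumed value used when spark.executor.cores is unset
    or not a positive integer.
    """
    raw = spark_conf.get("spark.executor.cores")
    try:
        cores = int(raw)
    except (TypeError, ValueError):
        return fallback
    return cores if cores > 0 else fallback


# With 6 executor cores, the loading cores follow automatically.
carbon_props = {
    "carbon.number.of.cores.while.loading": str(
        resolve_loading_cores({"spark.executor.cores": "6"})
    )
}
print(carbon_props)
```

Falling back to a fixed value keeps the behaviour predictable when the executor core count is not configured.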




Re: [DISCUSSION] Optimize the default value for some parameters

ravipesala
Hi,

Yes, it is a good suggestion. We can plan to set the number of loading cores
dynamically, based on the available executor cores. Can you please raise a
JIRA for it?

Regards,
Ravindra

On 25 October 2017 at 12:08, xm_zzc <[hidden email]> wrote:

> Hi,
>   If we are using CarbonData with Spark to load data, we can set
> carbon.number.of.cores.while.loading to the number of executor cores.
>
>   For example, when the number of executor cores is set to 6, at least 6
> cores per node are available for loading data, so
> carbon.number.of.cores.while.loading can be set to 6 automatically.



--
Thanks & Regards,
Ravi

Re: [DISCUSSION] Optimize the default value for some parameters

ravipesala
In reply to this post by Liang Chen
Hi Liang,

Currently, TABLE_BLOCKSIZE only limits the size of a carbondata file. It is
not considered when allocating tasks, so the TABLE_BLOCKSIZE value does not
matter for that purpose.
But yes, we can consider changing it to 512M.

We can also change the default blocklet size
(carbon.blockletgroup.size.in.mb) to 128MB; currently it is only 64MB.
Since task allocation is derived from blocklets, it is better to increase
the blocklet size. We should also add a table-level property for blocklet
size that can be configured when creating a table.
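
To illustrate why a larger blocklet size lowers the task count, here is a rough back-of-the-envelope sketch in Python. It assumes, as a simplification, exactly one task per blocklet; the function name estimated_tasks and the 1024 MB input size are hypothetical and only serve the example.

```python
import math


def estimated_tasks(data_size_mb, blocklet_size_mb):
    """Estimate the task count, assuming one task per blocklet (simplification)."""
    return math.ceil(data_size_mb / blocklet_size_mb)


# Compare the current default (64MB) with the proposed one (128MB).
for size_mb in (64, 128):
    print(size_mb, estimated_tasks(1024, size_mb))
```

Under this assumption, doubling the blocklet size halves the number of tasks for the same amount of data.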

Regards,
Ravindra.

On 11 October 2017 at 13:36, Liang Chen <[hidden email]> wrote:

> Hi all,
>
> As you know, the default values of some parameters need to be adjusted to
> suit most use cases. This discussion is for collecting the parameters whose
> default values need to be optimized:
>
> 1. TABLE_BLOCKSIZE:
> The current default is 1G; I propose adjusting it to 512M.
>
> 2.
> Please append here any other parameters whose default values you propose to
> adjust.
>
> Regards
> Liang



--
Thanks & Regards,
Ravi

Re: [DISCUSSION] Optimize the default value for some parameters

xm_zzc
In reply to this post by ravipesala
Hi ravipesala,
  OK, I will raise a JIRA for this and try to implement it.




Re: [DISCUSSION] Optimize the default value for some parameters

Liang Chen
Administrator
In reply to this post by ravipesala
Hi Ravi

Yes, so we need to provide a table-level property for blocklet size that can
be configured when creating a table. Can you please create a JIRA for this?

How about something like this:
CREATE TABLE IF NOT EXISTS XXXX (column_name column_type) STORED BY
'carbondata' TBLPROPERTIES ('TABLE_BLOCKLETSIZE'='128')
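
One way such a table-level property could interact with the existing system-level setting is sketched below in Python. TABLE_BLOCKLETSIZE is only the name proposed in this thread, and the precedence order (table property first, then carbon.blockletgroup.size.in.mb, then the current 64MB default) is an assumption for illustration, not confirmed CarbonData behaviour.

```python
DEFAULT_BLOCKLET_SIZE_MB = 64  # current default mentioned in this thread


def effective_blocklet_size_mb(table_props, system_props):
    """Resolve the blocklet size; assumed precedence: table > system > default."""
    # Table property names are matched case-insensitively here (an assumption).
    table = {key.upper(): value for key, value in table_props.items()}
    if "TABLE_BLOCKLETSIZE" in table:
        return int(table["TABLE_BLOCKLETSIZE"])
    if "carbon.blockletgroup.size.in.mb" in system_props:
        return int(system_props["carbon.blockletgroup.size.in.mb"])
    return DEFAULT_BLOCKLET_SIZE_MB


print(effective_blocklet_size_mb({"TABLE_BLOCKLETSIZE": "128"}, {}))
```

Letting the table-level value win would allow individual tables to be tuned without changing the system-wide default.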

Regards
Liang





