Load data into carbondata executors distributed unevenly

a
Hello!

Test result:
When I load CSV data into a CarbonData table three times, the tasks are distributed unevenly across the executors. My goal is one task per node, but some nodes get 2 tasks and some nodes get none.
See load data 1.png, load data 2.png, and load data 3.png.
carbondata data.PNG shows the data structure in Hadoop.
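
The imbalance described above can be made explicit by counting tasks per node. This is a minimal sketch, not part of the original post. The host names node1 to node7 and the sample task list are hypothetical placeholders, not the values from the screenshots. It assumes you have one host name per task, for example copied from the task table in the Spark UI.

```shell
# Count how many load tasks ran on each node, including nodes that got none.
# Arguments: the full list of node names (hypothetical names in the example below).
# Stdin: one host name per task.
tasks_per_node() {
  awk -v nodes="$*" '
    BEGIN { n = split(nodes, order, " ") }   # remember every known node, in order
    { count[$1]++ }                          # one input line = one task on that host
    END {
      for (i = 1; i <= n; i++)
        print order[i], count[order[i]] + 0  # "+ 0" prints 0 for idle nodes
    }
  '
}

# Illustrative input only: 7 tasks on 7 nodes, node1 doubled up and node3 idle.
printf '%s\n' node1 node1 node2 node4 node5 node6 node7 |
  tasks_per_node node1 node2 node3 node4 node5 node6 node7
```

Any node reported with 0 received no task, and any node with a count above 1 is doing extra work.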

Loading 400,000,000 records into the CarbonData table takes 2629 seconds, which is too long.
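
For scale, the figures above correspond to an average load rate of roughly 152,000 records per second. A small sketch of that arithmetic, assuming "4 0000 0000" means 400,000,000 records (the function name is made up for illustration):

```shell
# Average load throughput in records per second (integer division).
# $1: number of records loaded, $2: elapsed time in seconds
records_per_second() {
  echo $(( $1 / $2 ))
}

records_per_second 400000000 2629   # prints 152149
```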

Question:
How can I distribute the load evenly across the executors?

Environment:
Spark 2.1 + CarbonData 1.1, with 7 datanodes.

./bin/spark-shell \
--master yarn \
--deploy-mode client \
--num-executors n \
--executor-cores 10 \
--executor-memory 40G \
--driver-memory 8G

(n was 7 on the first run (result in load data 1.png), 6 on the second run (result in load data 2.png), and 8 on the third run (result in load data 3.png).)
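
Since the three runs differ only in the executor count, the command can be parameterized so that nothing else changes between runs. A minimal sketch follows. The function name is hypothetical, the flags and values are the ones from the command above, and the function only prints the command line (a dry run) instead of launching Spark.

```shell
# Print the spark-shell command for a given executor count.
# $1: value for --num-executors
spark_shell_cmd() {
  echo ./bin/spark-shell \
    --master yarn \
    --deploy-mode client \
    --num-executors "$1" \
    --executor-cores 10 \
    --executor-memory 40G \
    --driver-memory 8G
}

# The three runs described above: 7, then 6, then 8 executors.
for n in 7 6 8; do
  spark_shell_cmd "$n"
done
```

To start a session, run one of the printed lines from the Spark installation directory.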

carbon.properties
######## DataLoading Configuration ########
carbon.sort.file.buffer.size=20
carbon.graph.rowset.size=10000
carbon.number.of.cores.while.loading=10
carbon.sort.size=50000
carbon.number.of.cores.while.compacting=10
carbon.number.of.cores=10
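
One thing that may be worth checking is whether the loading core count in carbon.properties matches the cores given to each executor. The sketch below reads a key from a properties file and compares it with the --executor-cores value. Treating "loading cores should not exceed executor cores" as the rule is an assumption made for illustration, not a documented CarbonData requirement, and both function names are hypothetical.

```shell
# Read one key from a Java-style properties file (plain key=value lines only).
# $1: properties file, $2: key
get_prop() {
  awk -F= -v key="$2" '$1 == key { value = $2 } END { print value }' "$1"
}

# Compare carbon.number.of.cores.while.loading with the executor core count.
# $1: properties file, $2: value passed to --executor-cores
# Assumption: loading cores above the executor cores is treated as a mismatch.
check_loading_cores() {
  loading_cores="$(get_prop "$1" carbon.number.of.cores.while.loading)"
  if [ -z "$loading_cores" ]; then
    echo "carbon.number.of.cores.while.loading is not set in $1"
    return 0
  fi
  if [ "$loading_cores" -gt "$2" ]; then
    echo "carbon.number.of.cores.while.loading=$loading_cores exceeds --executor-cores $2"
    return 1
  fi
  echo "carbon.number.of.cores.while.loading=$loading_cores fits within --executor-cores $2"
}
```

With the settings in this post, check_loading_cores called on the carbon.properties file with 10 as the second argument reports that the value fits, since both are 10.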

Best regards!

Re: Load data into carbondata executors distributed unevenly

ravipesala
Hi,

It seems the attachments are missing. Can you attach them again?

Regards,
Ravindra.

On 30 March 2017 at 08:02, a <[hidden email]> wrote:

> Hello!
> [...]

--
Thanks & Regards,
Ravi

Re:Re: Load data into carbondata executors distributed unevenly

a
I have added the attachments.

At 2017-03-30 10:38:08, "Ravindra Pesala" <[hidden email]> wrote:
>Hi,
>
>It seems attachments are missing.Can you attach them again.
>
>Regards,
>Ravindra.
>
>On 30 March 2017 at 08:02, a <[hidden email]> wrote:
>> [...]

Re:Re: Load data into carbondata executors distributed unevenly

BabuLal
Hi,
Please refer to the JIRA issue below. I guess your issue is the same.
CARBONDATA-830
(Data loading scheduling has some issue)

Thanks
Babu
On Mar 30, 2017 12:26, "a" <[hidden email]> wrote:

> add attachments
> [...]

Re:Re:Re: Load data into carbondata executors distributed unevenly

a
Yes, it is. Babu, thanks for your help!


Best regards!
At 2017-03-30 17:32:07, "babu lal jangir" <[hidden email]> wrote:

>Hi
>Please refer below jira id . I guess your issue is same .
>CARBONDATA-830
>(Data loading scheduling has some issue)
>
>Thanks
>Babu