Apache CarbonData Dev Mailing List archive - Re:Re: Load data into carbondata executors distributed unevenly

Posted by a on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Load-data-into-carbondata-executors-distributed-unevenly-tp9831p9851.html

I have added the attachments.

At 2017-03-30 10:38:08, "Ravindra Pesala" <[hidden email]> wrote:
>Hi,
>
>It seems the attachments are missing. Can you attach them again?
>
>Regards,
>Ravindra.
>
>On 30 March 2017 at 08:02, a <[hidden email]> wrote:
>
>> Hello!
>>
>> *Test result:*
>> When I load CSV data into a CarbonData table three times, the tasks are
>> distributed unevenly across the executors. My goal is one task per node,
>> but some nodes get 2 tasks and some nodes get none.
>> See load data 1.png, load data 2.png, and load data 3.png.
>> carbondata data.PNG shows the data structure in Hadoop.
>>
>> Loading 400,000,000 records into the CarbonData table takes 2629 seconds,
>> which is too long.
>>
>> *Question:*
>> How can I get the tasks distributed evenly across the executors?
>>
>> *Environment:*
>> Spark 2.1 + CarbonData 1.1, with 7 datanodes.
>>
>> ./bin/spark-shell \
>>   --master yarn \
>>   --deploy-mode client \
>>   --num-executors n \
>>   --executor-cores 10 \
>>   --executor-memory 40G \
>>   --driver-memory 8G
>>
>> Here n was 7 the first time (result in load data 1.png), 6 the second
>> time (result in load data 2.png), and 8 the third time (result in
>> load data3.png).
>>
>> carbon.properties
>> ######## DataLoading Configuration ########
>> carbon.sort.file.buffer.size=20
>> carbon.graph.rowset.size=10000
>> carbon.number.of.cores.while.loading=10
>> carbon.sort.size=50000
>> carbon.number.of.cores.while.compacting=10
>> carbon.number.of.cores=10
>>
>> Best regards!
>
>
>
>-- 
>Thanks & Regards,
>Ravi
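
For scale, the load rate implied by the figures quoted above can be worked out with a quick shell calculation. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the per-node figure assumes a perfectly even spread over the 7 datanodes, which is exactly what the reported runs did not achieve.

```shell
#!/bin/sh
# Figures taken from the quoted message.
records=400000000   # "4 0000 0000" records loaded
seconds=2629        # reported load time in seconds
nodes=7             # datanodes in the cluster

# Shell arithmetic is integer-only, so both results are rounded down.
records_per_second=$(( records / seconds ))
records_per_node_per_second=$(( records_per_second / nodes ))

echo "overall load rate: ${records_per_second} records/s"
echo "per node, assuming an even spread over ${nodes} nodes: ${records_per_node_per_second} records/s"
```

That comes to roughly 152,000 records per second overall. Any node that receives no task contributes nothing to that total, so the nodes that receive two tasks have to carry a larger share.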