Load data into carbondata executors distributed unevenly

Posted by a on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Load-data-into-carbondata-executors-distributed-unevenly-tp9831.html

Hello!

Test result:
When I load CSV data into a CarbonData table three times, the tasks are distributed unevenly across the executors. My goal is one task per node, but some nodes get 2 tasks and some nodes get none.
See load data 1.png, load data 2.png, and load data 3.png.
carbondata data.PNG shows the data structure in Hadoop.

Loading 400,000,000 records into the CarbonData table takes 2629 seconds, which is too long.
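
For scale, here is a rough sketch of the overall load rate those two figures imply. It uses only the record count and the elapsed time quoted above. The variable names are just illustrative, and the result is an average over the whole load, not a per-executor measurement.

```shell
#!/bin/sh
# Figures from the post: 400,000,000 records loaded in 2629 seconds.
records=400000000
seconds=2629

# Shell arithmetic is integer-only, so the fractional part is dropped.
rate=$((records / seconds))

echo "approx. ${rate} records/s overall"
# prints: approx. 152149 records/s overall
```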

Question:
How can I make the tasks distribute evenly across the executors?

Environment:
Spark 2.1 + CarbonData 1.1, with 7 datanodes.

./bin/spark-shell \
--master yarn \
--deploy-mode client \
--num-executors n \
--executor-cores 10 \
--executor-memory 40G \
--driver-memory 8G

Here n is 7 on the first run (result in load data 1.png), 6 on the second run (result in load data 2.png), and 8 on the third run (result in load data 3.png).

carbon.properties
######## DataLoading Configuration ########
carbon.sort.file.buffer.size=20
carbon.graph.rowset.size=10000
carbon.number.of.cores.while.loading=10
carbon.sort.size=50000
carbon.number.of.cores.while.compacting=10
carbon.number.of.cores=10

Best regards!