Re: Carbon over-use cluster resources
Posted by Liang Chen on Apr 14, 2020; 12:36pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Carbon-over-use-cluster-resources-tp94332p94875.html
OK, thank you for the feedback on this issue; let us look into it.
Regards
Liang
Manhua Jiang wrote
> Hi All,
> Recently, I found that carbon over-uses cluster resources. Generally, the
> carbon workflow is not designed like a common spark task, which does only
> one small piece of work in one thread; instead, each task has its own mind/logic.
>
> For example,
> 1. Launch carbon with --num-executors=1 but set
> carbon.number.of.cores.while.loading=10;
> 2. Load a no_sort table with multi-block input. With N
> Iterator<CarbonRowBatch>, for example, carbon will start N tasks in
> parallel. And in each task the CarbonFactDataHandlerColumnar has
> model.getNumberOfCores() (let's say C) threads in ProducerPool. In total,
> N*C threads are launched; ==> This is the case that makes me take this as
> a serious problem. Too many threads keep the executor from sending
> heartbeats, so it gets killed.
>
> So, the over-use is related to how thread pools are used.
>
> This would affect the cluster's overall resource usage and may lead to
> misleading performance results.
>
> I hope you keep this in mind when fixing existing code or writing new code.
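
To make the N*C arithmetic in the quoted message concrete, here is a minimal Java sketch. It does not use any CarbonData classes: the class ThreadMultiplication and the method peakProducerThreads are hypothetical names made up for illustration, and the per-task ProducerPool is modelled with a plain java.util.concurrent fixed thread pool. It assumes, as the message describes, that each of the N parallel tasks builds its own pool of C threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not CarbonData code.
class ThreadMultiplication {

    // Waits on a latch; a timeout or interrupt becomes an unchecked exception.
    static void awaitLatch(CountDownLatch latch) {
        try {
            if (!latch.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for latch");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Stops accepting work and waits for the pool's threads to finish.
    static void shutdownAndWait(ExecutorService pool) {
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs n tasks in parallel; each task creates its own pool of c producer
    // threads. Returns how many producer threads were alive at the same time.
    static int peakProducerThreads(int n, int c) {
        AtomicInteger live = new AtomicInteger();
        CountDownLatch allStarted = new CountDownLatch(n * c);
        CountDownLatch release = new CountDownLatch(1);

        ExecutorService tasks = Executors.newFixedThreadPool(n);
        for (int i = 0; i < n; i++) {
            tasks.execute(() -> {
                // Stand-in for the per-task producer pool sized by the "cores" setting.
                ExecutorService producerPool = Executors.newFixedThreadPool(c);
                for (int j = 0; j < c; j++) {
                    producerPool.execute(() -> {
                        live.incrementAndGet();
                        allStarted.countDown();
                        awaitLatch(release); // hold the thread so all of them coexist
                    });
                }
                shutdownAndWait(producerPool);
            });
        }

        awaitLatch(allStarted);   // every producer thread is now running
        int observed = live.get();
        release.countDown();      // let the producers finish
        shutdownAndWait(tasks);
        return observed;
    }

    public static void main(String[] args) {
        int n = 4, c = 10;
        System.out.println(n + " tasks x " + c + " cores each = "
                + peakProducerThreads(n, c) + " producer threads");
    }
}
```

The count grows with the number of parallel tasks even though the "cores" setting itself stays fixed, which is the multiplication the message points out. The sketch only counts threads; it does not reproduce the heartbeat failure.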