
Re: Carbon over-use cluster resources

Posted by Liang Chen on Apr 14, 2020; 12:36pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Carbon-over-use-cluster-resources-tp94332p94875.html

OK, thank you for the feedback on this issue. Let us look into it.

Regards
Liang


Manhua Jiang wrote

> Hi All,
> Recently I found that Carbon over-uses cluster resources. In general, a
> Carbon task does not behave like a common Spark task, which does only one
> small piece of work in one thread; instead, each task has its own logic.
>
> For example:
> 1. Launch Carbon with --num-executors=1 but set
> carbon.number.of.cores.while.loading=10.
> 2. Load a no_sort table with multi-block input. With N
> Iterator<CarbonRowBatch> inputs, for example, Carbon starts N tasks in
> parallel, and in each task CarbonFactDataHandlerColumnar has
> model.getNumberOfCores() threads (let's say C) in its ProducerPool. In
> total, N*C threads are launched. This is the case that makes me consider
> this a serious problem: too many threads keep the executor from sending
> its heartbeat, and it gets killed.
>
> So the over-use is related to how thread pools are used.
>
> This affects the overall resource usage of the cluster and may lead to
> misleading performance results.
>
> I hope you will keep this in mind when fixing existing code or writing new code.
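
To make the N*C arithmetic in the quoted message concrete, here is a minimal sketch. It is not CarbonData code: it is written in Python only for illustration, and every name in it (PeakCounter, job, per_task_pools, shared_pool) is hypothetical. It compares N tasks that each own a pool of C threads with N tasks that share one pool capped at C threads, and records the peak number of jobs running at the same time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PeakCounter:
    """Tracks how many jobs are running at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)

    def __exit__(self, *exc):
        with self._lock:
            self._active -= 1


def job(counter):
    with counter:
        time.sleep(0.05)  # stand-in for producer work


def per_task_pools(n_tasks, cores):
    """Each task owns a pool of `cores` threads: up to n_tasks * cores threads."""
    counter = PeakCounter()

    def task():
        # Leaving the with-block waits for all submitted jobs to finish.
        with ThreadPoolExecutor(max_workers=cores) as pool:
            for _ in range(cores):
                pool.submit(job, counter)

    threads = [threading.Thread(target=task) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.peak


def shared_pool(n_tasks, cores):
    """All tasks submit to one pool, so at most `cores` jobs run at once."""
    counter = PeakCounter()
    with ThreadPoolExecutor(max_workers=cores) as pool:
        for _ in range(n_tasks * cores):
            pool.submit(job, counter)
    return counter.peak
```

With n_tasks=4 and cores=3, per_task_pools can reach a peak of up to 12 concurrent jobs, while shared_pool can never exceed 3. Whether a shared, bounded pool is the right fix for Carbon's loading path is an assumption of this sketch, not something the thread has settled.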





--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/