
Re: Carbon over-use cluster resources

Posted by Ajantha Bhat on Apr 15, 2020; 1:54pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Carbon-over-use-cluster-resources-tp94332p94924.html

Hi Manhua,

For No Sort and Local Sort only, we don't follow Spark's task-launch logic;
we have our own logic of one task per node. Inside that task, we can
control resources by configuration (carbon.number.of.cores.while.loading).
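
For reference, a minimal sketch of setting that property (the property key and the CarbonProperties API come from carbon-core; the value shown is only illustrative, and the Spark SQL SET form assumes the property is dynamically configurable):

    // Sketch only: cap the producer threads used inside each loading task.
    import org.apache.carbondata.core.util.CarbonProperties

    // Keep the per-task loading parallelism at the default of 2 cores.
    CarbonProperties.getInstance()
      .addProperty("carbon.number.of.cores.while.loading", "2")

    // The same property may also be settable per session in Spark SQL, e.g.:
    // spark.sql("SET carbon.number.of.cores.while.loading=2")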

As you pointed out in the above mail, *N * C is controlled by configuration*,
and the default value of C is 2.
*I see the cluster over-use problem only if you configure it badly.*
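
To make the N * C arithmetic concrete, here is a small illustrative calculation (the executor and core values are taken from the example quoted below; the number of parallel loading tasks N is hypothetical):

    // Illustrative only: N (parallel loading tasks) is a hypothetical value.
    val numExecutors = 1          // --num-executors=1, as in the quoted example
    val coresWhileLoading = 10    // carbon.number.of.cores.while.loading=10
    val numLoadingTasks = 8       // N: hypothetical count of Iterator<CarbonRowBatch> inputs

    // Each task's ProducerPool holds C = coresWhileLoading producer threads,
    // so the single executor ends up running roughly N * C of them.
    val producerThreads = numLoadingTasks * coresWhileLoading  // 8 * 10 = 80 threads

    // With the default C = 2, the same load would create only 8 * 2 = 16 threads.
    val defaultThreads = numLoadingTasks * 2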

Do you have any suggestions to change the design? Feel free to raise a
discussion and work on it.

Thanks,
Ajantha

On Tue, Apr 14, 2020 at 6:06 PM Liang Chen <[hidden email]> wrote:

> OK, thank you for reporting this issue; let us look into it.
>
> Regards
> Liang
>
>
> Manhua Jiang wrote
> > Hi All,
> > Recently, I found that Carbon over-uses cluster resources. Generally, the design
> > of the Carbon workflow does not behave like a common Spark task, which does only
> > one small piece of work in one thread; instead, the task has its own mind/logic.
> >
> > For example:
> > 1. Launch Carbon with --num-executors=1 but set
> > carbon.number.of.cores.while.loading=10.
> > 2. For a no_sort table with multi-block input, say N Iterator<CarbonRowBatch>
> > inputs, Carbon will start N tasks in parallel, and in each task the
> > CarbonFactDataHandlerColumnar has model.getNumberOfCores() (let's say C)
> > producer threads in its ProducerPool, launching N*C threads in total. ==> This
> > is the case that makes me take this as a serious problem: too many threads stall
> > the executor from sending heartbeats, and it gets killed.
> >
> > So, the over-use is related to the usage of thread pools.
> >
> > This would affect the cluster's overall resource usage and may lead to
> > wrong performance results.
> >
> > I hope this gets your notice when fixing or writing new code.