Re: Carbon over-use cluster resources

Posted by kumarvishal09 on Apr 16, 2020; 4:15pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Carbon-over-use-cluster-resources-tp94332p94957.html

Hi Manhua,
In addition to what Ajantha said, all of these configurations are exposed to
the user.
By default the number of threads is 2, so launching 2 threads on 1 core is okay.
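
To put rough numbers on this, here is a minimal Python sketch of the N * C arithmetic from Manhua's mail. The function names, the value N = 4 and the single executor core are assumptions made up for illustration; they are not CarbonData APIs or measured values. Only the default C = 2 and the setting carbon.number.of.cores.while.loading=10 come from this thread.

```python
def total_loading_threads(parallel_tasks: int, cores_per_task: int = 2) -> int:
    """Estimate threads on one executor: N parallel tasks, each with C producer threads.

    cores_per_task stands in for carbon.number.of.cores.while.loading
    (default 2 according to this thread). Hypothetical helper, not a CarbonData API.
    """
    if parallel_tasks < 0 or cores_per_task < 1:
        raise ValueError("need parallel_tasks >= 0 and cores_per_task >= 1")
    return parallel_tasks * cores_per_task


def threads_per_core(parallel_tasks: int, cores_per_task: int, executor_cores: int) -> float:
    """Ratio of estimated loading threads to the cores the executor actually has."""
    if executor_cores < 1:
        raise ValueError("executor_cores must be >= 1")
    return total_loading_threads(parallel_tasks, cores_per_task) / executor_cores


if __name__ == "__main__":
    n = 4  # assumed number of parallel tasks (N); illustrative only
    print(total_loading_threads(n))        # default C = 2
    print(total_loading_threads(n, 10))    # C = 10, as in Manhua's example
    print(threads_per_core(n, 10, 1))      # threads competing for one assumed core
```

With the default C the thread count stays close to N, while C = 10 multiplies it by ten, which is the "configured badly" case Ajantha describes.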

-Regards
Kumar Vishal

On Wed, 15 Apr 2020 at 9:55 PM, Ajantha Bhat <[hidden email]> wrote:

> Hi Manhua,
>
> Only for No sort and Local sort do we not follow the Spark task launch logic.
> We have our own logic of one node, one task, and inside that task we can
> control resource usage through configuration (carbon.number.of.cores.while.loading).
>
> As you pointed out in the mail above, *N * C is controlled by configuration*,
> and the default value of C is 2.
> *I see the cluster over-use problem only if you configure it badly.*
>
> Do you have any suggestions for changing the design? Feel free to raise a
> discussion and work on it.
>
> Thanks,
> Ajantha
>
> On Tue, Apr 14, 2020 at 6:06 PM Liang Chen <[hidden email]>
> wrote:
>
> > OK, thank you for the feedback on this issue; let us look into it.
> >
> > Regards
> > Liang
> >
> >
> > Manhua Jiang wrote
> > > Hi All,
> > > Recently, I found that carbon over-uses cluster resources. Generally, the
> > > design of the carbon workflow does not behave like a common Spark task,
> > > which does only one small piece of work in one thread; instead, the task
> > > has its own mind/logic.
> > >
> > > For example,
> > > 1. Launch carbon with --num-executors=1 but set
> > > carbon.number.of.cores.while.loading=10;
> > > 2. For a no_sort table with multi-block input, N Iterator<CarbonRowBatch>
> > > for example, carbon will start N tasks in parallel. In each task, the
> > > CarbonFactDataHandlerColumnar has model.getNumberOfCores() (let's say C)
> > > threads in its ProducerPool, so N*C threads are launched in total.
> > > ==> This is the case that makes me take this as a serious problem. Too
> > > many threads block the executor from sending its heartbeat, so it gets
> > > killed.
> > >
> > > So, the over-use is related to the usage of thread pools.
> > >
> > > This would affect the overall resource usage of the cluster and may lead
> > > to wrong performance results.
> > >
> > > I hope you will keep this in mind while fixing or writing new code.
> >
> >
>
kumar vishal