Apache CarbonData Dev Mailing List archive

Carbon over-use cluster resources

Manhua Jiang

Carbon over-use cluster resources

Hi All,
Recently, I found that Carbon over-uses cluster resources. In general, Carbon's workflow does not behave like a common Spark task, which does one small piece of work in one thread. Instead, each task has its own internal logic.

For example:
1. Launch Carbon with --num-executors=1 but set carbon.number.of.cores.while.loading=10.
2. For a no_sort table with multi-block input, say N Iterator<CarbonRowBatch>, Carbon starts N tasks in parallel. In each task, CarbonFactDataHandlerColumnar uses model.getNumberOfCores() threads (let's say C) in its ProducerPool, so N*C threads are launched in total. ==> This is the case that makes me treat this as a serious problem: too many threads stall the executor so that it fails to send heartbeats and gets killed.

So the over-use is related to how thread pools are used.
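
The N*C arithmetic above can be illustrated with a small standalone sketch. It does not use any CarbonData classes: the class name ThreadMultiplicationSketch and the method countLiveWorkers are hypothetical, and each "task" is simply modelled as owning a private fixed-size pool of C threads, which is an assumption about the shape of the problem rather than the actual ProducerPool code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not CarbonData code.
class ThreadMultiplicationSketch {

  // Models N parallel tasks, each owning a private pool of C worker threads,
  // and returns how many workers were alive at the same moment.
  static int countLiveWorkers(int tasks, int coresPerTask) {
    AtomicInteger live = new AtomicInteger();
    CountDownLatch allStarted = new CountDownLatch(tasks * coresPerTask);
    CountDownLatch release = new CountDownLatch(1);
    List<ExecutorService> pools = new ArrayList<>();

    for (int t = 0; t < tasks; t++) {
      // One pool per task: nothing bounds the total across tasks.
      ExecutorService pool = Executors.newFixedThreadPool(coresPerTask);
      pools.add(pool);
      for (int c = 0; c < coresPerTask; c++) {
        pool.submit(() -> {
          live.incrementAndGet();
          allStarted.countDown();
          try {
            release.await(); // hold the thread so all workers overlap
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
    }

    try {
      allStarted.await();
      return live.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      release.countDown();
      for (ExecutorService pool : pools) {
        pool.shutdown();
      }
    }
  }

  public static void main(String[] args) {
    // N = 4 tasks, C = 10 threads per task
    System.out.println(countLiveWorkers(4, 10));
  }
}
```

With N = 4 and C = 10 this reports 40 simultaneously live worker threads, even though the executor in the scenario above was sized for far fewer cores.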

This affects overall cluster resource usage and may lead to misleading performance results.

I hope you will keep this in mind when fixing existing code or writing new code.

Liang Chen

Re: Carbon over-use cluster resources

Administrator

OK, thank you for reporting this issue. Let us look into it.

Regards
Liang


--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Ajantha Bhat

Re: Carbon over-use cluster resources

Hi Manhua,

Only for no-sort and local-sort loading, we don't follow Spark's task launch logic.
We have our own logic of one node, one task. Inside that task we can
control resource usage through configuration (carbon.number.of.cores.while.loading).

As you pointed out in the mail above, *N * C is controlled by configuration*,
and the default value of C is 2.
*I see a cluster over-use problem only if it is configured badly.*

Do you have any suggestions for changing the design? Feel free to raise a
discussion and work on it.

Thanks,
Ajantha

On Tue, Apr 14, 2020 at 6:06 PM Liang Chen <[hidden email]> wrote:

> OK, thank you for reporting this issue. Let us look into it.
>
> Regards
> Liang

kumarvishal09

Re: Carbon over-use cluster resources

Hi Manhua,
In addition to what Ajantha said, all of these configurations are exposed to the
user.
And by default the number of threads is 2, so launching 2 threads on 1 core is okay.

-Regards
Kumar Vishal

On Wed, 15 Apr 2020 at 9:55 PM, Ajantha Bhat <[hidden email]> wrote:

> Hi Manhua,
>
> Only for no-sort and local-sort loading, we don't follow Spark's task launch
> logic. We have our own logic of one node, one task. Inside that task we can
> control resource usage through configuration
> (carbon.number.of.cores.while.loading).
>
> As you pointed out in the mail above, *N * C is controlled by configuration*,
> and the default value of C is 2.
> *I see a cluster over-use problem only if it is configured badly.*
>
> Do you have any suggestions for changing the design? Feel free to raise a
> discussion and work on it.
>
> Thanks,
> Ajantha

Manhua Jiang

Re: Carbon over-use cluster resources

Hi Vishal,
When you say that launching 2 threads on 1 core is okay, is that the view from the system level?
In YARN mode, what the application gets is a vCore, so Carbon should not treat it as a physical core.

On 2020/04/16 16:15:23, Kumar Vishal <[hidden email]> wrote:

> Hi Manhua,
> In addition to what Ajantha said, all of these configurations are exposed to
> the user.
> And by default the number of threads is 2, so launching 2 threads on 1 core
> is okay.
>
> -Regards
> Kumar Vishal

Manhua Jiang

Re: Carbon over-use cluster resources

In reply to this post by Ajantha Bhat

Hi Ajantha,
If we look at this problem from the opposite side, Carbon may waste resources if the user does not set the properties correctly.

What about concurrent loading?

So, first of all, we need to figure out where executor services are used and how many there are. If we keep the logic of one node, one task, we need to keep the overall number of running threads in a task under control.

Then, a few thoughts:
Is a global executor service possible? That may introduce dependencies between different steps of loading.
Are separate executor services for each step (or some other unit) of loading possible? Can a specific executor service change size? (For example, once local sort is done, most threads work on writing and none on input reading and converting.)
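
To make the two questions above more concrete, here is a minimal sketch of one shared, resizable pool built on java.util.concurrent.ThreadPoolExecutor. It is only an illustration of the idea, not a proposal for the actual loading code: the class SharedLoadPoolSketch and its methods are hypothetical names, and the jobs are empty placeholders standing in for loading steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not CarbonData code.
class SharedLoadPoolSketch implements AutoCloseable {
  private final ThreadPoolExecutor pool;

  SharedLoadPoolSketch(int threads) {
    // Fixed-size pool with an unbounded queue: extra jobs wait instead of
    // spawning extra threads, so the total is bounded no matter how many
    // callers submit work.
    pool = new ThreadPoolExecutor(
        threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
  }

  // Changes the pool size at runtime, e.g. when one loading step finishes.
  void resize(int threads) {
    if (threads > pool.getMaximumPoolSize()) {
      // Growing: raise the maximum first so core never exceeds it.
      pool.setMaximumPoolSize(threads);
      pool.setCorePoolSize(threads);
    } else {
      // Shrinking (or same size): lower the core size first.
      pool.setCorePoolSize(threads);
      pool.setMaximumPoolSize(threads);
    }
  }

  int size() {
    return pool.getCorePoolSize();
  }

  // Runs `jobs` placeholder jobs and reports the peak number of threads
  // the pool has ever had.
  int runJobsAndReportPeak(int jobs) {
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < jobs; i++) {
      futures.add(pool.submit(() -> { }));
    }
    for (Future<?> f : futures) {
      try {
        f.get();
      } catch (InterruptedException | ExecutionException e) {
        throw new IllegalStateException(e);
      }
    }
    return pool.getLargestPoolSize();
  }

  @Override
  public void close() {
    pool.shutdown();
  }
}
```

In this sketch, 20 jobs submitted to a pool of size 2 never use more than 2 threads, and resize(4) followed by resize(1) changes the bound without recreating the pool. Whether sharing one pool across loading steps causes the dependency problems mentioned above is exactly the open question.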

BTW, do you know why the configuration "carbon.number.of.cores.while.loading" was introduced?

On 2020/04/15 13:54:50, Ajantha Bhat <[hidden email]> wrote:

> Hi Manhua,
>
> Only for no-sort and local-sort loading, we don't follow Spark's task launch
> logic. We have our own logic of one node, one task. Inside that task we can
> control resource usage through configuration
> (carbon.number.of.cores.while.loading).
>
> As you pointed out in the mail above, *N * C is controlled by configuration*,
> and the default value of C is 2.
> *I see a cluster over-use problem only if it is configured badly.*
>
> Do you have any suggestions for changing the design? Feel free to raise a
> discussion and work on it.
>
> Thanks,
> Ajantha

David CaiQiang

Re: Carbon over-use cluster resources

In reply to this post by Manhua Jiang

Hi Manhua,
Currently, no_sort reuses the loading flow of local_sort. This is not a good
solution, and it leads to the situation you mentioned. In my opinion,
we need to adjust the loading flow of no_sort, perhaps eventually making it
like global_sort.

In addition, the producer-consumer pattern in data encoding and
compression can also be optimized for no_sort and global_sort, perhaps by just
prefetching one page and processing it instead of using a thread pool.
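
One possible reading of "prefetch one page and process it" is sketched below: a wrapper that fetches exactly one element ahead on a single background thread while the caller processes the current one, rather than running a pool of C producers. The class OnePagePrefetcher is a hypothetical name, "page" here is just a generic element of an iterator, and none of this is taken from the CarbonData code base.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not CarbonData code.
class OnePagePrefetcher<T> implements Iterator<T>, AutoCloseable {
  private final Iterator<T> source;
  // A single background thread: at most one page is in flight at any time.
  private final ExecutorService fetcher = Executors.newSingleThreadExecutor();
  private Future<T> pending; // null once the source is exhausted

  OnePagePrefetcher(Iterator<T> source) {
    this.source = source;
    this.pending = scheduleFetch();
  }

  // Starts fetching the next page, or shuts the fetcher down if none is left.
  // Only called while no fetch is in flight, so the source is never touched
  // by two threads at once.
  private Future<T> scheduleFetch() {
    if (!source.hasNext()) {
      fetcher.shutdown();
      return null;
    }
    Callable<T> fetch = source::next;
    return fetcher.submit(fetch);
  }

  @Override
  public boolean hasNext() {
    return pending != null;
  }

  @Override
  public T next() {
    if (pending == null) {
      throw new NoSuchElementException();
    }
    try {
      T page = pending.get();    // wait for the prefetched page
      pending = scheduleFetch(); // start fetching the following one
      return page;               // caller processes this page meanwhile
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
  }

  @Override
  public void close() {
    fetcher.shutdownNow();
  }
}
```

The thread count per task stays at one extra thread regardless of carbon.number.of.cores.while.loading, at the cost of less parallelism inside a task.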

-----
Best Regards
David Cai