[Discussion] Parallel Insert and Update

[Discussion] Parallel Insert and Update

Kejian Li
Dear community,

This mail is about parallel insert and update. Currently Carbon does not
support running an insert (or load data) concurrently with an update, because
doing so may lead to data inconsistency or incorrect results.

Today Carbon blocks an update operation while an insert operation is in
progress by throwing a concurrent operation exception straight away. If the
running insert is very time consuming, the update has to wait until it
finishes, and sometimes this wait is very long.
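
To make the current behaviour concrete, here is a minimal sketch in Python. It is not Carbon's actual code: the names Table, ConcurrentOperationError, begin_insert and finish_insert are made up for illustration, and the only thing it is meant to show is the fail-fast rule described above, where an update issued during a running insert is rejected and can only succeed once the insert has finished.

```python
import threading


class ConcurrentOperationError(RuntimeError):
    """Hypothetical stand-in for Carbon's concurrent operation exception."""


class Table:
    """Toy table that tracks whether an insert is in progress (illustrative only)."""

    def __init__(self):
        self._state_lock = threading.Lock()
        self._inserts_in_progress = 0
        self.rows = []

    def begin_insert(self):
        # Mark the start of a (possibly long-running) insert.
        with self._state_lock:
            self._inserts_in_progress += 1

    def finish_insert(self, new_rows):
        # Publish the inserted rows and mark the insert as done.
        with self._state_lock:
            self.rows.extend(new_rows)
            self._inserts_in_progress -= 1

    def update(self, fn):
        # Fail fast instead of queueing: the caller has to retry later.
        with self._state_lock:
            if self._inserts_in_progress > 0:
                raise ConcurrentOperationError("insert in progress; update rejected")
            self.rows = [fn(row) for row in self.rows]


table = Table()
table.begin_insert()
try:
    table.update(lambda row: row + 1)  # issued while the insert is running
    outcome = "updated"
except ConcurrentOperationError:
    outcome = "rejected"

table.finish_insert([1, 2, 3])
table.update(lambda row: row * 10)  # succeeds once the insert has finished
```

In this sketch the update is never queued behind the insert; the effective waiting time is however long the caller has to keep retrying until the insert completes.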

To overcome this problem, we are planning to support parallel insert and
update. I have proposed one possible solution for implementing this
feature.

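This is the design document: Parallel_Insert_and_Update.pdf
<http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/file/t495/Parallel_Insert_and_Update.pdf>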

Please go through the design document and share your input on whether this
approach is okay or has any drawbacks.

Thanks & Regards
Kejian Li
Re: Parallel Insert and Update

Ajantha Bhat
Hi Kejian Li,
Thanks for working on this.

I see that this design and requirement are similar to what Nihal
discussed a few days ago:
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-Parallel-compaction-and-update-td100338.html

So, as Ravindra suggested to Nihal, it is probably better to handle this
through the segment interface refactoring as well.

Thanks,
Ajantha

On Wed, Oct 14, 2020 at 2:46 PM Kejian Li <[hidden email]> wrote:

> [...]
Re: Parallel Insert and Update

akashrn5
Agree with Ajantha.

Regards
Akash

On Wed, Oct 14, 2020, 3:22 PM Ajantha Bhat <[hidden email]> wrote:

> [...]