http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Feature-Design-Document-for-Update-Delete-support-in-CarbonData-tp3043p3116.html
The design looks sound and the documentation is great.
I have a few suggestions.
1) Measure update vs. dimension update: in the case of a dimension update for
dept1, can we just update the dictionary for faster performance?
I could not understand this section. I wanted to confirm whether we will support
a single update statement that updates multiple rows.
> Hi Aniket
>
> Thank you for finishing such a good design document. A couple of inputs from my
> side:
>
> 1. Please also add the information mentioned below (RowID definition etc.) to
> the design document.
> 2. On page 6: "Schema change operation can run in parallel with Update or
> Delete operations, but not with another schema change operation". Can you
> explain this item?
> 3. Please unify the terminology: use "CarbonData" instead of "Carbon", and
> use one term consistently for "destination table" and "target table".
> 4. Is the Update operation's delete delta the same as the Delete operation's
> delete delta?
>
> BTW, it would be much better if you could provide Google Docs for review
> next time; it is really difficult to comment on PDF documents :)
>
> Regards
> Liang
>
> Aniket Adnaik wrote
> > Hi Sujith,
> >
> > Please see my comments inline.
> >
> > Best Regards,
> > Aniket
> >
> > On Sun, Nov 20, 2016 at 9:11 PM, sujith chacko <sujithchacko.2010@> wrote:
> >
> >> Hi Aniket,
> >>
> >> It's a well-documented design. I just want to know a few points, such as:
> >>
> >> a. Format of the RowID and its datatype
> >>
> > AA>> The following format can be used to represent a unique RowID:
> >
> > [
> > <Segment ID>
> > <Block ID>
> > <Blocklet ID>
> > <Offset in Blocklet>
> > ]
> > A simple way would be to use the String data type and store it in a text
> > file.
> > However, a more efficient way could be to use bitsets/bitmaps as a further
> > optimization. Compressed bitmaps such as Roaring bitmaps can be used for
> > better performance and efficient storage.
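
The RowID layout described above can be sketched roughly as follows. This is only an illustration, not the format from the design document: the integer component types, the "/" separator, and the names RowId, offsets_to_bitmap and is_deleted are all assumptions made for the example. A plain Python integer stands in for an uncompressed bitset; a Roaring bitmap would play the same role with compression.

```python
from typing import Iterable, NamedTuple


class RowId(NamedTuple):
    """Hypothetical RowID: the four components listed in the thread."""
    segment_id: int
    block_id: int
    blocklet_id: int
    offset: int  # row offset inside the blocklet

    def encode(self) -> str:
        # Simple text form; the "/" separator is an assumption.
        return "/".join(str(part) for part in self)

    @classmethod
    def decode(cls, text: str) -> "RowId":
        return cls(*(int(part) for part in text.split("/")))


def offsets_to_bitmap(offsets: Iterable[int]) -> int:
    """Pack deleted row offsets of ONE blocklet into an integer bitset."""
    bitmap = 0
    for off in offsets:
        bitmap |= 1 << off  # set the bit at position `off`
    return bitmap


def is_deleted(bitmap: int, offset: int) -> bool:
    """Check whether the bit for `offset` is set."""
    return (bitmap >> offset) & 1 == 1
```

With the bitset variant, segment, block and blocklet would identify which bitmap to consult, and only the offset needs to be stored per deleted row.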
> >
> > b. Impact of this feature on select queries: since the query process
> > has to exclude each deleted record and include the corresponding updated
> > record every time, is any optimization being considered to tackle the query
> > performance issue? One of the major highlights of CarbonData is performance.
> > AA>> Some of the optimizations would be to cache the deltas to avoid
> > repeated I/O, to store sorted rowids in the delete delta for efficient
> > lookup, and to perform regular compaction to minimize the impact on select
> > query performance.
> > Additionally, we may have to explore ways to perform compaction
> > automatically, for example, if more than 25% of rows are read from deltas.
> > Please feel free to share if you have any ideas or suggestions.
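
Two of the optimizations mentioned above (sorted rowids in the delete delta, and a compaction trigger based on the share of rows read from deltas) can be sketched as below. This is a minimal sketch under assumptions: rowids are modelled as (segment, block, blocklet, offset) tuples, and the names DeleteDelta, scan and needs_compaction are hypothetical, not CarbonData APIs. The 25% threshold is the example figure from the thread.

```python
from bisect import bisect_left


class DeleteDelta:
    """Hypothetical in-memory delete delta holding sorted rowids."""

    def __init__(self, deleted_row_ids):
        # Sort once so every lookup is a binary search.
        self._sorted_ids = sorted(set(deleted_row_ids))

    def contains(self, row_id) -> bool:
        i = bisect_left(self._sorted_ids, row_id)
        return i < len(self._sorted_ids) and self._sorted_ids[i] == row_id


def scan(rows, delta):
    """Return the (row_id, value) pairs that are not marked as deleted."""
    return [(rid, value) for rid, value in rows if not delta.contains(rid)]


def needs_compaction(rows_from_deltas, total_rows, threshold=0.25):
    """True when MORE than `threshold` of the rows were read from deltas."""
    return total_rows > 0 and rows_from_deltas / total_rows > threshold
```

Keeping a DeleteDelta object alive between queries would also cover the caching idea, since the delta file is then read and sorted only once.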
> >
> > Thanks,
> > Sujith
> >
> > On Nov 20, 2016 9:24 PM, "Aniket Adnaik" <aniket.adnaik@> wrote:
> >
> >> Hi All,
> >>
> >> Please find a design doc for Update/Delete support in CarbonData.
> >>
> >>
> >> https://drive.google.com/file/d/0B71_EuXTdDi8S2dxVjN6Z1RhWlU/view?usp=sharing
> >>
> >> Best Regards,
> >> Aniket
> >>