http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Feature-Design-Document-for-Update-Delete-support-in-CarbonData-tp3043p3129.html
I have a few queries regarding the 1st suggestion.
1. Dimensions can both be dictionary and no dictionary. If we update the
columns and 1 for no dictionary columns. Will that be ok?
2. We write dictionary files in append mode. Updating dictionary files will
dictionary file.
> Hi Aniket,
>
> The design looks sound and the documentation is great.
> I have a few suggestions.
>
> 1) Measure update vs dimension update: In the case of a dimension update,
> for example when a user wants to change dept1 to dept2 for all users who are
> under dept1, can we just update the dictionary for faster performance?
> 2) Update Semantics (one matching record vs multiple matching records): I
> could not understand this section. I wanted to confirm whether we will
> support one update statement updating multiple rows.
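The dictionary-only update asked about in point 1 can be pictured with a toy model. This is a hedged sketch, not CarbonData code: the `dictionary` mapping, the `encoded_rows` list, and the `rename_value` helper are all hypothetical names invented for illustration. It only shows why rewriting one dictionary entry changes every row that references it without touching the row data.

```python
# Hypothetical dictionary-encoded column: surrogate key -> value.
dictionary = {1: "dept1", 2: "dept3"}

# Row data stores only surrogate keys, never the string values.
encoded_rows = [1, 2, 1, 1]


def decode(rows, mapping):
    """Translate surrogate keys back to their values."""
    return [mapping[key] for key in rows]


def rename_value(mapping, old, new):
    """Return a copy of the mapping with every `old` value replaced by `new`."""
    return {key: (new if value == old else value) for key, value in mapping.items()}


# One dictionary entry changes; encoded_rows is left untouched.
updated = rename_value(dictionary, "dept1", "dept2")
print(decode(encoded_rows, updated))
```

The shortcut only applies when the update covers every row holding the old value, as in the dept1 to dept2 example; an update restricted to some of those rows would still need per-row changes.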
>
> -Vimal
>
> On Tue, Nov 22, 2016 at 2:30 PM, Liang Chen <[hidden email]> wrote:
>
> > Hi Aniket
> >
> > Thank you for finishing this good design document. A couple of inputs from
> > my side:
> >
> > 1. Please add the information mentioned below (RowID definition etc.) to
> > the design document as well.
> > 2. On page 6: "Schema change operation can run in parallel with Update or
> > Delete operations, but not with another schema change operation". Can you
> > explain this item?
> > 3. Please unify the terminology: use "CarbonData" instead of "Carbon", and
> > use one term consistently for "destination table" and "target table".
> > 4. Is the Update operation's delete delta the same as the Delete
> > operation's delete delta?
> >
> > BTW, it would be much better if you could provide Google Docs for review
> > next time; it is really difficult to comment on PDF documents :)
> >
> > Regards
> > Liang
> >
> > Aniket Adnaik wrote
> > > Hi Sujith,
> > >
> > > Please see my comments inline.
> > >
> > > Best Regards,
> > > Aniket
> > >
> > > On Sun, Nov 20, 2016 at 9:11 PM, sujith chacko <sujithchacko.2010@> wrote:
> > >
> > >> Hi Aniket,
> > >>
> > > >> It's a well-documented design. I just want to know a few points, such as:
> > >>
> > >> a. Format of the RowID and its datatype
> > >>
> > > > AA>> The following format can be used to represent a unique rowid:
> > >
> > > [
> > > <Segment ID>
> > > <Block ID>
> > > <Blocklet ID>
> > > <Offset in Blocklet>
> > > ]
> > > > A simple way would be to use the String data type and store it as a text
> > > > file. However, a more efficient way could be to use Bitsets/Bitmaps as a
> > > > further optimization. Compressed bitmaps such as Roaring bitmaps can be
> > > > used for better performance and efficient storage.
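As an illustration of the rowid format described above, the four components could be modeled as a tuple and serialized to a string. This is a minimal sketch, not CarbonData's actual implementation: the `RowId` class, the integer component types, and the `/` separator are assumptions made for this example.

```python
from typing import NamedTuple


class RowId(NamedTuple):
    """Hypothetical row identifier; fields follow the format listed above."""
    segment_id: int
    block_id: int
    blocklet_id: int
    offset: int  # position of the row inside the blocklet

    def to_text(self) -> str:
        # The "/" separator is an assumption, chosen only for readability.
        return "/".join(str(part) for part in self)

    @classmethod
    def from_text(cls, text: str) -> "RowId":
        # Parse the four integer components back out of the string form.
        segment_id, block_id, blocklet_id, offset = (int(p) for p in text.split("/"))
        return cls(segment_id, block_id, blocklet_id, offset)


rid = RowId(segment_id=2, block_id=0, blocklet_id=5, offset=1024)
print(rid.to_text())
```

Because the tuple compares component by component, rowids in this form sort by segment, then block, then blocklet, then offset, which is convenient if they are to be stored in sorted order.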
> > >
> > > > b. Impact of this feature on select queries: since every query has to
> > > > exclude each deleted record and include the corresponding updated
> > > > record, is any optimization being considered to tackle the query
> > > > performance issue? One of the major highlights of Carbon is performance.
> > > > AA>> Some of the optimizations would be to cache the deltas to avoid
> > > > recurrent I/O, to store sorted rowids in the delete delta for efficient
> > > > lookup, and to perform regular compaction to minimize the impact on
> > > > select query performance. Additionally, we may have to explore ways to
> > > > perform compaction automatically, for example, if more than 25% of rows
> > > > are read from deltas.
> > > > Please feel free to share if you have any ideas or suggestions.
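Two of the optimizations mentioned above, sorted rowids for lookup and a threshold-based compaction trigger, can be sketched as follows. This is an illustrative sketch under assumptions, not the design's implementation: the function names are hypothetical, the delete delta is modeled as a sorted list of row offsets within one blocklet, and the 25% figure is taken from the example in the reply.

```python
from bisect import bisect_left


def is_deleted(sorted_deleted_offsets, offset):
    """Binary-search a sorted list of deleted row offsets."""
    i = bisect_left(sorted_deleted_offsets, offset)
    return i < len(sorted_deleted_offsets) and sorted_deleted_offsets[i] == offset


def scan_blocklet(rows, sorted_deleted_offsets):
    """Yield only the rows whose offset is absent from the delete delta."""
    for offset, row in enumerate(rows):
        if not is_deleted(sorted_deleted_offsets, offset):
            yield row


def needs_compaction(rows_from_deltas, total_rows, threshold=0.25):
    """Return True when more than `threshold` of the rows read came from deltas."""
    return total_rows > 0 and rows_from_deltas / total_rows > threshold


print(list(scan_blocklet(["a", "b", "c", "d"], [1, 3])))
print(needs_compaction(rows_from_deltas=30, total_rows=100))
```

Keeping the offsets sorted makes each membership check logarithmic in the size of the delta; a compressed bitmap, as suggested earlier in the thread, would be an alternative representation for the same lookup.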
> > >
> > > Thanks,
> > > Sujith
> > >
> > > > On Nov 20, 2016 9:24 PM, "Aniket Adnaik" <aniket.adnaik@> wrote:
> > >
> > >> Hi All,
> > >>
> > >> Please find a design doc for Update/Delete support in CarbonData.
> > >>
> > >>
> > > >> https://drive.google.com/file/d/0B71_EuXTdDi8S2dxVjN6Z1RhWlU/view?usp=sharing
> > >>
> > >> Best Regards,
> > >> Aniket
> > >>