
Re: [Feature ]Design Document for Update/Delete support in CarbonData

Posted by manishgupta88 on Nov 21, 2016; 12:00pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Feature-Design-Document-for-Update-Delete-support-in-CarbonData-tp3043p3063.html

Hi Aniket,

I think the RowID format should also include a partition ID. CarbonData
does not support partitioning yet, but once it does, the format below
would already accommodate it.

 [<Partition ID><Segment ID><Block ID><Blocklet ID><Offset in Blocklet>]
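
To make the proposed layout concrete, here is a minimal sketch of a five-part RowID stored as text. The class name, the integer field types, and the "/" separator are assumptions for illustration only; the thread does not define any of them.

```python
from typing import NamedTuple


class RowId(NamedTuple):
    """Hypothetical row identifier; fields follow the proposed format."""
    partition_id: int
    segment_id: int
    block_id: int
    blocklet_id: int
    offset: int  # row offset inside the blocklet

    def encode(self) -> str:
        # "/" is an assumed separator, not something the design specifies.
        return "/".join(str(part) for part in self)

    @classmethod
    def decode(cls, text: str) -> "RowId":
        parts = [int(p) for p in text.split("/")]
        if len(parts) != len(cls._fields):
            raise ValueError(
                f"expected {len(cls._fields)} fields, got {len(parts)}"
            )
        return cls(*parts)


row = RowId(partition_id=0, segment_id=2, block_id=5, blocklet_id=1, offset=42)
encoded = row.encode()
decoded = RowId.decode(encoded)
```

Because the partition ID is the leading field, sorting the tuples groups all rows of one partition together, which is one reason to place it first.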

Regards
Manish Gupta

On Mon, Nov 21, 2016 at 1:07 PM, Aniket Adnaik <[hidden email]>
wrote:

> Hi Sujith,
>
> Please see my comments inline.
>
> Best Regards,
> Aniket
>
> On Sun, Nov 20, 2016 at 9:11 PM, sujith chacko <
> [hidden email]>
> wrote:
>
> > Hi Aniket,
> >
> >       It's a well-documented design. I just want to clarify a few points:
> >
> > a.  Format of the RowID and its datatype
> >
>  AA>> The following format can be used to represent a unique RowID:
>
>  [<Segment ID><Block ID><Blocklet ID><Offset in Blocklet>]
>  A simple way would be to use the String data type and store the RowIDs in
> a text file. A more efficient option, as a further optimization, would be
> Bitsets/Bitmaps. Compressed bitmaps such as Roaring bitmaps can be used
> for better performance and more compact storage.
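
As a rough illustration of the bitmap idea, the sketch below keeps one bitset of deleted offsets per (segment, block, blocklet). It uses plain Python integers as uncompressed bitsets, not Roaring bitmaps; the class and method names are hypothetical and not part of the design document.

```python
from collections import defaultdict


class DeleteDelta:
    """Tracks deleted rows as one bitset per (segment, block, blocklet)."""

    def __init__(self):
        # Python ints are arbitrary-precision, so each one can act as a
        # plain (uncompressed) bitset: bit i set means offset i is deleted.
        self._bits = defaultdict(int)

    def mark_deleted(self, segment_id, block_id, blocklet_id, offset):
        self._bits[(segment_id, block_id, blocklet_id)] |= 1 << offset

    def is_deleted(self, segment_id, block_id, blocklet_id, offset):
        # .get avoids creating empty entries for blocklets with no deletes.
        bits = self._bits.get((segment_id, block_id, blocklet_id), 0)
        return bool((bits >> offset) & 1)

    def deleted_count(self):
        return sum(bin(bits).count("1") for bits in self._bits.values())


delta = DeleteDelta()
delta.mark_deleted(0, 1, 0, offset=3)
delta.mark_deleted(0, 1, 0, offset=7)
delta.mark_deleted(0, 2, 1, offset=0)
```

A compressed bitmap would expose the same set/test operations while using less space when the deleted offsets are sparse or clustered.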
>
> b.  Impact of this feature on select queries: every query has to exclude
> each deleted record and include the corresponding updated record. Is any
> optimization being considered to address query performance, given that
> performance is one of the major highlights of carbon?
> AA>> Some of the optimizations would be to cache the deltas to avoid
> repeated I/O, to store sorted RowIDs in the delete delta for efficient
> lookup, and to perform regular compaction to minimize the impact on
> select query performance. Additionally, we may have to explore ways to
> trigger compaction automatically, for example, if more than 25% of rows
> are read from deltas.
> Please feel free to share if you have any ideas or suggestions.
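
Two of the ideas above, sorted RowIDs for fast lookup and a threshold-based compaction trigger, can be sketched as follows. The function names and the tuple representation of a RowID are assumptions; only the 25% figure comes from the message, and how the delta row count is measured is left open there.

```python
from bisect import bisect_left


def is_deleted(sorted_row_ids, row_id):
    """Binary-search a sorted list of deleted RowIDs (O(log n) per lookup)."""
    i = bisect_left(sorted_row_ids, row_id)
    return i < len(sorted_row_ids) and sorted_row_ids[i] == row_id


def needs_compaction(delta_rows, total_rows, threshold=0.25):
    """True when the share of rows served from deltas exceeds the threshold."""
    if total_rows <= 0:
        return False
    return delta_rows / total_rows > threshold


# RowIDs as (segment, block, blocklet, offset) tuples, sorted once up front
# so that every later lookup can use binary search.
deleted = sorted([(0, 1, 0, 7), (0, 1, 0, 3), (0, 2, 1, 0)])
```

Caching the sorted list in memory after the first read would cover the remaining suggestion, avoiding repeated I/O on the delta file.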
>
> Thanks,
> Sujith
>
> On Nov 20, 2016 9:24 PM, "Aniket Adnaik" <[hidden email]> wrote:
>
> > Hi All,
> >
> > Please find a design doc for Update/Delete support in CarbonData.
> >
> > https://drive.google.com/file/d/0B71_EuXTdDi8S2dxVjN6Z1RhWlU/view?usp=sharing
> >
> > Best Regards,
> > Aniket
> >
>