http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Minimize-the-Btree-size-and-unify-the-driver-and-executor-Btrees-tp12771p13004.html
number of node is 10 then 10 index file per segment). For merging all the
maintain block's footer to task id mapping.
>
> > 在 2017年5月19日,下午8:28,Ravindra Pesala <
[hidden email]> 写道:
> >
> > Hi Vishal & Jacky,
> >
> > Yes, we need to extract interfaces for storing index in memory like
> > onheap/offheap. And same interface can be implemented to store/retrieve
> > index from external service like DB.
> >
> > Not all segments will be kept in single tree here, we try to keep one
> index
> > file to one array. Here one segment may contain multiple indexes so we
> will
> > have multiple arrays per segment.
>
> One more suggest here, we should make number of index file as less as
> possible so that improve IO to load the index, like make one index file for
> one segment only. We can do it by issuing one more mapreduce job after the
> loading carbondata file. The job will read all footers and reduce to
> generate one index file. What do you think?
>
> >
> > Yes, the ratio of blocklets inside blocks are small, so for each task we
> > will send the information of blocklets which needs to scanned inside
> block
> > so the serialization cost is very minimum.
> >
> > Regards,
> > Ravindra
> >
> > On 19 May 2017 at 08:17, Jacky Li <
[hidden email]> wrote:
> >
> >>
> >>> 在 2017年5月18日,下午3:31,Kumar Vishal <
[hidden email]> 写道:
> >>>
> >>> Hi Ravi,
> >>>
> >>> I have few queries related to both the solution.
> >>>
> >>> 1. To reduce the java heap for Btree , we can remove the Btree data
> >>> structure and use simple single array and do binary search on it. And
> >> also
> >>> we should move this cached arrays to unsafe (offheap/onheap) to reduce
> >> the
> >>> burden on GC.
> >>>
> >>
> >> I think now is the right time to abstract driver side index into an
> >> interface, which can have multiple implementation: onheap, offheap,
> >> external service, etc.
> >>
> >>> Single array of object?? If user is loading small data in each load so
> we
> >>> will save only intermediate node of btree by maintaining in array. Yes
> by
> >>> maintaining all the metadata(startkey, min,max) in offheap will reduce
> >>> memory footprint. Is it possible to maintain data in single byte array
> >> and
> >>> maintaining the offset of each key for each column??
> >>>
> >>
> >> Do you mean put to all indexes for all segments in a single array? In
> that
> >> case, we can compare in-memory sequential scan and binary search which
> >> introduce random walk on memory. I feel sequential scan may beat random
> >> walk if the number of element is not many.
> >>
> >>> 2. Unify the btree to single Btree instead of 2 and load at driver
> side.
> >>> So that only one lookup can be done to find the blocklets directly. And
> >>> executors are not required to load the btree for every query.
> >>>
> >>> If we unify the btree to single btree, then it will be loaded in driver
> >>> side, so driver will need more memory to maintain this metadata and for
> >>> each query driver needs to do blocklet level pruning, in case of
> >> concurrent
> >>> query driver may be overloaded. And how we will send this metadata
> >>> information (offset of each column in file and other metadata) to
> >> executor,
> >>> serializing and deserializing this metadata cost also we need to
> >> consider.
> >>>
> >>
> >> Assuming there is only 4~8 blocklet in block, I think for a block, we
> can
> >> sent blockletId using a byte array. Should be ok, right?
> >>
> >>> Please correct me if my comments are not valid.
> >>>
> >>> -Regards
> >>> Kumar Vishal
> >>>
> >>>
> >>> On Thu, May 18, 2017 at 12:33 PM, Ravindra Pesala <
>
[hidden email]
> >>>
> >>> wrote:
> >>>
> >>>> Hi Liang,
> >>>>
> >>>> Yes Liang , it will be done in 2 parts. At first reduce the size of
> the
> >>>> btree and then merge the driver side and executor btree to single
> btree.
> >>>>
> >>>> Regards,
> >>>> Ravindra.
> >>>>
> >>>> On 17 May 2017 at 19:28, Liang Chen <
[hidden email]> wrote:
> >>>>
> >>>>> Hi Ravi
> >>>>>
> >>>>> Thank you bringing this improvement discussion to mailing list.
> >>>>>
> >>>>> One question , the point1 how to solve the below issues ? there are
> >> still
> >>>>> two part index info in driver and executor side ?
> >>>>> ------------------------------------------------------------
> >>>>> ----------------------------------------
> >>>>> And also chances of loading btree on each executor is more for every
> >>>> query
> >>>>> because there is no guarantee that same block goes to same executor
> >> every
> >>>>> time. It will be worse in case of dynamic containers.
> >>>>>
> >>>>> Regards
> >>>>> Liang
> >>>>>
> >>>>> 2017-05-17 7:33 GMT-04:00 Ravindra Pesala <
[hidden email]>:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> *1. Current problem.*
> >>>>>> 1.There is more size taking on java heap to create Btree for index
> >>>> file.
> >>>>>> It is because we create multiple objects for each leaf node so it
> >> takes
> >>>>>> more memory inside heap than actual size of index file. while doing
> >> LRU
> >>>>>> cache also we are considering only index file size instead of
> objects
> >>>>> size
> >>>>>> so it impacts the eviction process of LRU cache.
> >>>>>> 2. Currently we load one btree on driver side to find the blocks and
> >>>>> load
> >>>>>> another btree on executor side to find the blocklets. After we have
> >>>>>> increased the blocklet size to 128 mb and decrease the table_block
> >> size
> >>>>> to
> >>>>>> 256 mb the number of nodes inside driver side btree and executor
> side
> >>>>> btree
> >>>>>> is not much different. So it would be overhead to read the same
> >>>>> information
> >>>>>> twice.
> >>>>>> And also chances of loading btree on each executor is more for every
> >>>>> query
> >>>>>> because there is no guarantee that same block goes to same executor
> >>>> every
> >>>>>> time. It will be worse in case of dynamic containers.
> >>>>>>
> >>>>>> *2. Proposed solution.*
> >>>>>> 1. To reduce the java heap for Btree , we can remove the Btree data
> >>>>>> structure and use simple single array and do binary search on it.
> And
> >>>>> also
> >>>>>> we should move this cached arrays to unsafe (offheap/onheap) to
> reduce
> >>>>> the
> >>>>>> burden on GC.
> >>>>>> 2. Unify the btree to single Btree instead of 2 and load at driver
> >>>> side.
> >>>>>> So that only one lookup can be done to find the blocklets directly.
> >> And
> >>>>>> executors are not required to load the btree for every query.
> >>>>>> We can consider moving this to separate metadata service
> eventually
> >>>>>> once the memory footprint get reduced.
> >>>>>>
> >>>>>> First I will consider point 1 reduce the btree size after that I
> >>>> consider
> >>>>>> merging of btrees.
> >>>>>>
> >>>>>> Please comment on it.
> >>>>>>
> >>>>>> --
> >>>>>> Thanks & Regards,
> >>>>>> Ravindra.
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Regards
> >>>>> Liang
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thanks & Regards,
> >>>> Ravi
> >>>>
> >>
> >>
> >>
> >>
> >
> >
> > --
> > Thanks & Regards,
> > Ravi
> >
>
>
>
>