Apache CarbonData Dev Mailing List archive

B-Tree LRU cache (New Feature)

mohdshahidkhan

B-Tree LRU cache (New Feature)

Hi All,

Please find the problem statement and the proposed solution below.

B-Tree LRU Cache:

Problem:

CarbonData maintains two levels of B-Tree cache, one at the driver level and another at the executor level. Currently CarbonData has a mechanism to invalidate the segment and block caches for invalid table segments, but there is no eviction policy for unused cached objects. So once the available memory is completely utilized, the system will not be able to process any new requests.

Solution:

The caches maintained at the driver level and at the executors are bound to contain objects that are not currently in use. Therefore the system should have the following mechanism:

1. Set a maximum memory limit up to which objects can be held in memory.

2. When the configured memory limit is reached, identify the cached objects that are currently not in use, so that the required memory can be freed without impacting the running processes.

3. Eviction should continue only until the required memory has been freed.
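
The three steps above can be sketched roughly as follows. This is an illustrative Python sketch only, not CarbonData code: the class name LruCache, its methods, and the per-entry access count used to decide whether an object is "in use" are all assumptions made for the example.

```python
from collections import OrderedDict


class LruCache:
    """Size-bounded cache that evicts only entries that are not in use.

    Hypothetical sketch; names and structure are assumptions.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes      # step 1: configured memory limit
        self.used_bytes = 0
        # key -> [value, size_bytes, access_count], least recently used first
        self._entries = OrderedDict()

    def get(self, key):
        """Return the cached value and mark it as in use, or None on a miss."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[2] += 1                   # a caller is now using this object
        self._entries.move_to_end(key)  # most recently used goes to the end
        return entry[0]

    def release(self, key):
        """Signal that one user of the entry has finished with it."""
        entry = self._entries.get(key)
        if entry is not None and entry[2] > 0:
            entry[2] -= 1

    def put(self, key, value, size_bytes):
        """Cache a value; return False if enough memory cannot be freed."""
        if key in self._entries:
            return True
        required = self.used_bytes + size_bytes - self.max_bytes
        if required > 0 and not self._evict(required):
            return False
        self._entries[key] = [value, size_bytes, 0]
        self.used_bytes += size_bytes
        return True

    def _evict(self, required):
        # Step 2: collect unused entries, least recently used first.
        victims, freed = [], 0
        for key, (_, size, access_count) in self._entries.items():
            if freed >= required:
                break                   # step 3: stop once enough is freed
            if access_count == 0:
                victims.append(key)
                freed += size
        if freed < required:
            return False                # the rest is in use; evict nothing
        for key in victims:
            self.used_bytes -= self._entries.pop(key)[1]
        return True
```

In this sketch a put() that cannot free enough memory fails without evicting anything, so objects that are still being read by a running query are never removed.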

For details please refer to attachments.

Regards.

Shahid

mohdshahidkhan

Re: B-Tree LRU cache (New Feature)

Please find the design document for the B-Tree LRU cache:
https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing

Venkata Gollamudi

Re: B-Tree LRU cache (New Feature)

Hi Shahid,

This solution, an LRU cache for the B-Tree, is required to avoid running
out of memory when too many tables exist in the store and not all of them
are frequently used.

Please raise an issue to track this feature.

Regards,
Ramana

On Wed, Nov 23, 2016 at 6:30 PM, mohdshahidkhan <
[hidden email]> wrote:


sujith chacko

Re: B-Tree LRU cache (New Feature)

In reply to this post by mohdshahidkhan

Hi Shahid,

It's a well-explained document; I just need a few clarifications.

a) Once compaction is done, the segments and their blocks will be
invalidated. The LRU's scope is to evict unused or least recently used
objects from memory, but after compaction the segment itself becomes
invalid. So is it really required to hold such objects in the LRU cache and
wait for eviction until the cache memory is full?

Thanks,
Sujith

On Wed, Nov 23, 2016 at 6:30 PM, mohdshahidkhan <
[hidden email]> wrote:

manishgupta88

Re: B-Tree LRU cache (New Feature)

Hi Sujith,

I agree with your point. In the query model we can always send the executors
a list of invalid segments that need to be cleared from the cache.
But there are a few cases where clearing the B-tree cache cannot be ensured:
1. The table is dropped.
2. The clean table DML command is executed.

In these cases we cannot ensure that invalid objects are cleared from the
cache on all the executors. Only removal from the driver can be ensured.
To handle these cases, each executor should have its own mechanism to deal
with invalid segment/block/dictionary cache entries.
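
The first part of this, clearing the segments listed in the query model, might look roughly like the sketch below. It is illustrative Python only: ExecutorBlockCache, execute_query, and the query model fields "table" and "invalid_segments" are hypothetical names, not CarbonData APIs.

```python
class ExecutorBlockCache:
    """Executor-side block cache keyed by (table, segment_id, block_path).

    Hypothetical sketch; names and key layout are assumptions.
    """

    def __init__(self):
        self._blocks = {}

    def put(self, table, segment_id, block_path, btree):
        self._blocks[(table, segment_id, block_path)] = btree

    def clear_invalid_segments(self, table, invalid_segment_ids):
        """Drop every cached block of the given table's invalid segments."""
        invalid = set(invalid_segment_ids)
        stale = [key for key in self._blocks
                 if key[0] == table and key[1] in invalid]
        for key in stale:
            del self._blocks[key]
        return len(stale)

    def cached_segments(self, table):
        """Return the segment ids of a table that still have cached blocks."""
        return {seg for (tbl, seg, _) in self._blocks if tbl == table}


def execute_query(cache, query_model):
    """Clear the segments the driver marked invalid before scanning."""
    cache.clear_invalid_segments(query_model["table"],
                                 query_model["invalid_segments"])
    # The actual block scan is omitted from this sketch.
```

Note that the clearing is driven by an incoming query. For a dropped table no further query reaches the executor, so its entries stay cached in this sketch, which is the gap that an executor-side eviction policy would have to cover.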

Regards
Manish Gupta

On Sun, Dec 4, 2016 at 10:14 PM, sujith chacko <[hidden email]>
wrote:


Venkata Gollamudi

Re: B-Tree LRU cache (New Feature)

Hi Shahid,
Please introduce a CacheClient that owns the incrementing and decrementing
of the access count as objects come into use and go out of use. Otherwise
access count handling will become complicated as we add more features to
the system.
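
One possible shape for such a client is sketched below in Python. Only the name CacheClient comes from this thread; CachedObject, the cache_client helper, and the use of a plain dict as the cache are assumptions made for illustration.

```python
from contextlib import contextmanager


class CachedObject:
    """A cached B-Tree plus a count of the callers currently using it."""

    def __init__(self, value):
        self.value = value
        self.access_count = 0


class CacheClient:
    """Sole owner of access-count bookkeeping for one query or task."""

    def __init__(self, cache):
        self._cache = cache      # assumed: plain dict of key -> CachedObject
        self._held = []          # objects this client has marked as in use

    def get(self, key):
        obj = self._cache.get(key)
        if obj is None:
            return None
        obj.access_count += 1
        self._held.append(obj)
        return obj.value

    def close(self):
        """Release everything this client holds, exactly once."""
        while self._held:
            self._held.pop().access_count -= 1


@contextmanager
def cache_client(cache):
    """Scope a CacheClient so that close() runs even if the query fails."""
    client = CacheClient(cache)
    try:
        yield client
    finally:
        client.close()
```

Because every increment is recorded by the client that made it, callers never touch access_count directly, and a failed query cannot leave an object marked as in use.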
Regards,
Ramana

On Sun, Dec 4, 2016, 10:31 PM manish gupta <[hidden email]>
wrote:


jarray888

Re: B-Tree LRU cache (New Feature)

mohdshahidkhan

Re: B-Tree LRU cache (New Feature)

In reply to this post by Venkata Gollamudi

Hi Sujith,
I agree with you that after compaction there is no use in keeping the segment cache or the block
cache. We should have a mechanism to invalidate the compacted segments' cache on the driver and
the block-level cache on the executors.