B-Tree LRU cache (New Feature)


mohdshahidkhan
Hi All,
Please find the problem and the proposed solution below.

B-Tree LRU Cache:

Problem:

CarbonData maintains two levels of B-Tree cache, one at the driver level and another at the executor level. Currently CarbonData has a mechanism to invalidate the segment and block caches for invalid table segments, but there is no eviction policy for unused cached objects. So once memory is completely utilized, the system will not be able to process any new requests.

Solution:

In the caches maintained at the driver level and at the executor level, there will usually be objects that are currently not in use. Therefore the system should have the following mechanism:

1. Set a maximum memory limit up to which objects can be held in memory.

2. When the configured memory limit is reached, identify the cached objects that are currently not in use, so that the required memory can be freed without impacting running processes.

3. Eviction should continue only until the required memory has been freed.
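
The three points above can be sketched roughly as follows. This is only an illustrative Python sketch, not the design from the attached document and not CarbonData code: the class `BoundedLruCache`, its methods, and the idea of tracking a per-entry size and access count are all assumptions made for the example.

```python
from collections import OrderedDict


class BoundedLruCache:
    """Hypothetical sketch: an LRU cache bounded by total size in bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes  # point 1: configured memory limit
        self.used_bytes = 0
        # key -> [value, size_bytes, access_count]; least recently used first.
        self._entries = OrderedDict()

    def put(self, key, value, size_bytes):
        """Cache a value; return False if enough memory cannot be freed."""
        if key in self._entries:
            return True
        needed = self.used_bytes + size_bytes - self.max_bytes
        if needed > 0 and not self._evict(needed):
            return False
        self._entries[key] = [value, size_bytes, 0]
        self.used_bytes += size_bytes
        return True

    def acquire(self, key):
        """Mark an entry as in use and as most recently used."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[2] += 1
        self._entries.move_to_end(key)
        return entry[0]

    def release(self, key):
        """Mark one use of the entry as finished."""
        entry = self._entries[key]
        entry[2] = max(0, entry[2] - 1)

    def _evict(self, needed_bytes):
        # Point 2: pick victims oldest-first, skipping entries still in use.
        # Point 3: stop as soon as the required amount of memory is covered.
        victims, freed = [], 0
        for key, (_, size, count) in self._entries.items():
            if freed >= needed_bytes:
                break
            if count == 0:
                victims.append(key)
                freed += size
        if freed < needed_bytes:
            return False  # evict nothing if the request cannot be satisfied
        for key in victims:
            self.used_bytes -= self._entries.pop(key)[1]
        return True
```

In this sketch a `put` that cannot be satisfied leaves the cache untouched and reports failure, so the caller can decide whether to fail the request or load the object without caching it.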

For details please refer to attachments.


Regards.

Shahid


Re: B-Tree LRU cache (New Feature)

mohdshahidkhan
Please find Design document for B-Tree LRU cache
https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing

Re: B-Tree LRU cache (New Feature)

Venkata Gollamudi
Hi Shahid,

This solution, an LRU cache for the B-Tree, is required to avoid running
out of memory when a large number of tables exist in the store and not all
of them are frequently used.

Please raise an issue to track this feature.

Regards,
Ramana

On Wed, Nov 23, 2016 at 6:30 PM, mohdshahidkhan <[hidden email]> wrote:

> Please find Design document for B-Tree LRU cache
> https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing

Re: B-Tree LRU cache (New Feature)

sujith chacko
In reply to this post by mohdshahidkhan
Hi Shahid,

It is a well-explained document; I just need a few clarifications.

a) Once compaction is done, the segments and their blocks are invalidated.
The LRU's scope is to evict unused or least recently used objects from
memory, but after compaction the segment itself becomes invalid. So is it
really required to hold such objects in the LRU cache and wait for eviction
until the memory limit is reached?

Thanks,
Sujith


Re: B-Tree LRU cache (New Feature)

manishgupta88
Hi Sujith,

I agree with your point. In the query model we can always send the
executors a list of invalid segments that need to be cleared from the cache.
But there are a few cases where clearing the B-tree cache cannot be ensured,
such as:
1. A table is dropped.
2. The clean table DML command is executed.

In these cases we cannot ensure that invalid objects are cleared from the
cache on all the executors; only removal from the driver can be ensured.
To handle these cases, each executor should have a mechanism to identify
invalid segment/block/dictionary cache entries on its own.
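
As a small illustration of the first idea only (a list of invalid segments travelling with the query), an executor could clear its block cache along these lines. This is a hedged Python sketch: the function name `drop_invalid_segments` and the `(table, segment_id, block_path)` key layout are assumptions made for the example, not CarbonData's actual structures, and it does not address the dropped-table and clean-command cases listed above.

```python
def drop_invalid_segments(block_cache, table, invalid_segment_ids):
    """Remove cached blocks of `table` that belong to an invalid segment.

    `block_cache` is assumed to be a dict keyed by the hypothetical tuple
    (table, segment_id, block_path). Returns the number of removed entries.
    """
    invalid = set(invalid_segment_ids)
    # Collect the keys first so the dict is not modified while iterating.
    stale = [key for key in block_cache
             if key[0] == table and key[1] in invalid]
    for key in stale:
        del block_cache[key]
    return len(stale)


# Example: segments 0 and 1 of table "t1" were merged by compaction.
executor_cache = {
    ("t1", 0, "part-0"): "btree-a",
    ("t1", 1, "part-0"): "btree-b",
    ("t1", 2, "part-0"): "btree-c",
    ("t2", 0, "part-0"): "btree-d",
}
removed = drop_invalid_segments(executor_cache, "t1", [0, 1])
```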

Regards
Manish Gupta

On Sun, Dec 4, 2016 at 10:14 PM, sujith chacko <[hidden email]> wrote:

> Hi Shahid,
>
> It is a well-explained document; I just need a few clarifications.
>
> a) Once compaction is done, the segments and their blocks are invalidated.
> The LRU's scope is to evict unused or least recently used objects from
> memory, but after compaction the segment itself becomes invalid. So is it
> really required to hold such objects in the LRU cache and wait for eviction
> until the memory limit is reached?
>
> Thanks,
> Sujith

Re: B-Tree LRU cache (New Feature)

Venkata Gollamudi
Hi Shahid,
Please introduce a CacheClient that owns the incrementing and decrementing
of the access count as objects start and stop being used. Otherwise, access
count handling will become complicated as we add more features to the system.
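
One possible shape for this, sketched in Python purely for illustration: `CountingCache` is a hypothetical stand-in for the real cache, and the `CacheClient` methods shown here are assumptions, not an agreed API. The point of the sketch is that callers never touch the access count directly; the client increments it on `get` and decrements everything it holds on `close`.

```python
class CountingCache:
    """Hypothetical stand-in for the B-Tree cache; tracks access counts only."""

    def __init__(self):
        self._values = {}
        self.access_count = {}

    def put(self, key, value):
        self._values[key] = value
        self.access_count.setdefault(key, 0)

    def get(self, key):
        return self._values[key]


class CacheClient:
    """Hypothetical owner of access-count bookkeeping for one query or task."""

    def __init__(self, cache):
        self._cache = cache
        self._held = []  # keys whose access count this client has incremented

    def get(self, key):
        value = self._cache.get(key)  # raises KeyError if the key is not cached
        self._cache.access_count[key] += 1
        self._held.append(key)
        return value

    def close(self):
        # Release everything this client acquired, exactly once.
        while self._held:
            self._cache.access_count[self._held.pop()] -= 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions from the query


cache = CountingCache()
cache.put("segment_1", "btree")
with CacheClient(cache) as client:
    client.get("segment_1")
    count_while_in_use = cache.access_count["segment_1"]
count_after_close = cache.access_count["segment_1"]
```

Because the release happens in `close`, an entry is returned to the evictable state even when the query fails partway through.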
Regards,
Ramana

On Sun, Dec 4, 2016, 10:31 PM manish gupta <[hidden email]> wrote:

> Hi Sujith,
>
> I agree with your point. In the query model we can always send the
> executors a list of invalid segments that need to be cleared from the cache.
> But there are a few cases where clearing the B-tree cache cannot be ensured,
> such as:
> 1. A table is dropped.
> 2. The clean table DML command is executed.
>
> In these cases we cannot ensure that invalid objects are cleared from the
> cache on all the executors; only removal from the driver can be ensured.
> To handle these cases, each executor should have a mechanism to identify
> invalid segment/block/dictionary cache entries on its own.
>
> Regards
> Manish Gupta

Re: B-Tree LRU cache (New Feature)

jarray888
+1

Re: B-Tree LRU cache (New Feature)

mohdshahidkhan
In reply to this post by Venkata Gollamudi
Hi Sujith,
I agree with you that after compaction there is no use in keeping the segment and block
caches. We should have a mechanism to invalidate the compacted segments' cache on the driver and
the block-level cache on the executors.