[DISCUSSION] Removal of access count based removal from LRU cache

manishgupta88
Hi All,

Currently in CarbonData we have an LRU caching feature for maintaining the
BTree and Dictionary caches. This feature is helpful on low-end systems where
memory is limited, or where the user wants to control the amount of memory
the CarbonData system uses for caching.

In the LRU cache, an atomic access count variable is maintained for every key
in the cache map; it is incremented when a query accesses that key and
decremented once that query has finished using it.

There are many places where we access dictionary columns, such as decoding
values for result preparation, filter operations, data loading, etc., and it
becomes cumbersome to maintain the access count at the entry and exit points
of every such operation. If the increments and decrements ever become
inconsistent, the corresponding key will never be cleared from the cache map,
and if space is not freed, queries will start failing with an
unavailable-memory exception.
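
As a minimal sketch of the pattern described above (illustrative Java only;
the names are hypothetical, not the actual CarbonData classes), every reader
has to pair an increment with a decrement, and a single missed decrement pins
the entry in the cache map forever:

import java.util.concurrent.atomic.AtomicInteger;

class AccessCountedEntry<V> {
    private final V value;
    private final AtomicInteger accessCount = new AtomicInteger(0);

    AccessCountedEntry(V value) {
        this.value = value;
    }

    // Every reader must call acquire() before using the cached value ...
    V acquire() {
        accessCount.incrementAndGet();
        return value;
    }

    // ... and release() when it is done. A missed release() keeps the count
    // above zero forever.
    void release() {
        accessCount.decrementAndGet();
    }

    // The cache may evict an entry only when no query currently holds it,
    // so a leaked count means the entry can never be removed.
    boolean canBeEvicted() {
        return accessCount.get() == 0;
    }
}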

Therefore I suggest the following behavior.
1. Remove access count based removal from the caching framework and make the
framework use proper LRU-based removal (a sketch of points 1 and 3 follows
this list).
2. Ensure that for one query the BTree and dictionary caches are accessed at
most once by the driver and the executor.
3. Fail the query if the size required by the dictionary column or BTree is
more than the size the user has configured for the LRU cache. This is because
the user should be made aware that the cache size needs to be increased,
rather than the CarbonData system taking a runtime decision on its own.
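
A minimal sketch of points 1 and 3, assuming the byte limit comes from the
user's configured LRU cache size; the class and method names are hypothetical
and this is not the actual CarbonData implementation:

import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class SizeBoundedLruCache<K> {
    private final long maxSizeBytes;  // assumed to come from the user's LRU cache config
    private long currentSizeBytes = 0;
    // accessOrder = true keeps entries ordered by most recent access -> true LRU
    private final LinkedHashMap<K, byte[]> map = new LinkedHashMap<>(16, 0.75f, true);

    SizeBoundedLruCache(long maxSizeBytes) {
        this.maxSizeBytes = maxSizeBytes;
    }

    synchronized void put(K key, byte[] value) {
        // Point 3: fail fast if a single entry can never fit, so the user knows
        // the configured cache size must be increased.
        if (value.length > maxSizeBytes) {
            throw new IllegalStateException(
                "Required size exceeds the configured LRU cache size");
        }
        // Point 1: free space purely by least-recently-used order, no access counts.
        Iterator<Map.Entry<K, byte[]>> it = map.entrySet().iterator();
        while (currentSizeBytes + value.length > maxSizeBytes && it.hasNext()) {
            Map.Entry<K, byte[]> eldest = it.next();
            currentSizeBytes -= eldest.getValue().length;
            it.remove();
        }
        map.put(key, value);           // replacing an existing key is not handled here
        currentSizeBytes += value.length;
    }

    synchronized byte[] get(K key) {
        return map.get(key);           // lookup refreshes the entry's LRU position
    }
}

With this behaviour, whether an entry stays cached depends only on how
recently it was used, so there is no per-query bookkeeping that can leak.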

Please share your inputs on this.

Regards
Manish Gupta

Re: [DISCUSSION] Removal of access count based removal from LRU cache

ravipesala
Looks fine, but failing the query when memory is not enough to fit the entry
into the LRU cache is not a good idea. In LRU caching terminology there is no
question of failing queries; we just evict the old data from the cache when
memory is not sufficient to add the latest entry. We should log both the
eviction and the addition so that the user can analyse whether the cache size
should be increased.
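
For illustration, a self-contained sketch of this evict-and-log behaviour
(hypothetical names, bounded by entry count instead of bytes for brevity; not
the actual CarbonData code):

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Logger;

class EvictAndLogLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final Logger LOGGER =
        Logger.getLogger(EvictAndLogLruCache.class.getName());
    private final int maxEntries;

    EvictAndLogLruCache(int maxEntries) {
        super(16, 0.75f, true);        // accessOrder = true -> true LRU ordering
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after every put(); returning true evicts the
    // least recently used entry instead of failing the query.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean evict = size() > maxEntries;
        if (evict) {
            LOGGER.info("LRU cache full: evicting key " + eldest.getKey()
                + "; consider increasing the configured cache size");
        }
        return evict;
    }
}

Every eviction then leaves a trace in the logs instead of a failed query,
which is exactly the signal the user needs to tune the cache size.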

Regards,
Ravindra.



