Hi All,
Please find the problem and proposed solution below.

B-Tree LRU Cache

Problem: CarbonData maintains two levels of B-Tree cache, one at the driver and one at the executor. Currently CarbonData has a mechanism to invalidate the segment and block caches for invalid table segments, but there is no eviction policy for unused cached objects. As a result, once the available memory is fully utilized, the system cannot process any new requests.

Solution: At any given time, the caches maintained at the driver and at the executor are likely to hold objects that are not currently in use. The system should therefore provide the following mechanism:
1. Set a maximum memory limit up to which objects can be held in memory.
2. When the configured memory limit is reached, identify the cached objects that are not currently in use, so that the required memory can be freed without impacting running processes.
3. Evict only until the required memory has been freed.

For details, please refer to the attachments.

Regards,
Shahid |
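The three steps of the proposal can be illustrated with a minimal Java sketch. It is not CarbonData's implementation: the class name `LruBTreeCache`, its methods, and the use of string keys with a byte size are assumptions made purely for illustration. The sketch keeps entries in least-recently-used order, treats an entry with a non-zero access count as "in use", and evicts unused entries only until the required memory is freed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a memory-bounded LRU cache; names are illustrative only. */
class LruBTreeCache {
    /** One cached object: its size and how many callers are currently using it. */
    static final class Entry {
        final long sizeInBytes;
        int accessCount; // greater than 0 means the object is in use and must not be evicted
        Entry(long sizeInBytes) { this.sizeInBytes = sizeInBytes; }
    }

    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration runs from least to most recently used
    private final LinkedHashMap<String, Entry> entries = new LinkedHashMap<>(16, 0.75f, true);

    LruBTreeCache(long maxBytes) { this.maxBytes = maxBytes; }

    /** Adds an object, evicting unused entries first if the limit would be exceeded. */
    synchronized boolean put(String key, long sizeInBytes) {
        if (entries.containsKey(key)) return true;
        long required = usedBytes + sizeInBytes - maxBytes;
        if (required > 0 && !evict(required)) return false; // nothing evictable: reject
        entries.put(key, new Entry(sizeInBytes));
        usedBytes += sizeInBytes;
        return true;
    }

    /** Marks the object as in use; get() also moves it to the most-recently-used end. */
    synchronized boolean acquire(String key) {
        Entry e = entries.get(key);
        if (e == null) return false;
        e.accessCount++;
        return true;
    }

    /** Marks one use of the object as finished. */
    synchronized void release(String key) {
        Entry e = entries.get(key);
        if (e != null && e.accessCount > 0) e.accessCount--;
    }

    synchronized boolean contains(String key) { return entries.containsKey(key); }

    synchronized long usedBytes() { return usedBytes; }

    /** Evicts unused entries in LRU order, stopping as soon as `required` bytes are freed. */
    private boolean evict(long required) {
        List<String> victims = new ArrayList<>();
        long freed = 0;
        for (Map.Entry<String, Entry> e : entries.entrySet()) {
            if (freed >= required) break;
            if (e.getValue().accessCount == 0) {
                victims.add(e.getKey());
                freed += e.getValue().sizeInBytes;
            }
        }
        if (freed < required) return false; // not enough unused memory; cache left untouched
        for (String key : victims) {
            usedBytes -= entries.remove(key).sizeInBytes;
        }
        return true;
    }
}
```

In this sketch a `put` that cannot free enough memory is rejected rather than allowed to exceed the limit; whether the real system should fail, wait, or fall back to uncached reads in that case is a design decision the thread leaves open.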
Please find the design document for the B-Tree LRU cache:
https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing |
Hi Shahid,
An LRU cache for the B-Tree is required to avoid running out of memory when the store contains a large number of tables and not all of them are used frequently. Please raise an issue to track this feature.

Regards,
Ramana

On Wed, Nov 23, 2016 at 6:30 PM, mohdshahidkhan <[hidden email]> wrote:
> Please find Design document for B-Tree LRU cache
> https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/B-Tree-LRU-cache-New-Feature-tp2366p3130.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.
|
In reply to this post by mohdshahidkhan
Hi Shahid,
It's a well-explained document; I just need a few clarifications.

a) Once compaction is done, the compacted segments and their blocks are invalidated. The LRU cache's scope is to evict unused or least recently used objects from memory, but after compaction the segment itself becomes invalid. Is it really necessary to hold such objects in the LRU cache and wait for eviction until the cache memory is full?

Thanks,
Sujith |
Hi Sujith,
I agree with your point. We can always send the executors, in the query model, a list of invalid segments that need to be cleared from the cache. But there are a few cases where clearing the B-tree cache cannot be ensured, such as:
1. A table is dropped.
2. A clean table DML command is executed.

In these cases we cannot ensure that invalid objects are cleared from the cache on all executors; only removal from the driver can be ensured. To handle these cases, each executor should have its own mechanism to identify invalid segment, block, and dictionary cache entries.

Regards,
Manish Gupta

On Sun, Dec 4, 2016 at 10:14 PM, sujith chacko <[hidden email]> wrote:
> Hi Shahid,
>
> its a well explained document, just need few clarifications,
>
> a) once compaction is done the segments and its blocks will be invalidated,
> LRU's scope is to evict the unused objects from memory or least recently
> used objects from memory, but after compaction the segment itself becomes
> invalid,So is it really require to hold such objects in LRU cache and wait
> for eviction till its memory size gets full?
>
> Thanks,
> Sujith
|
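One way an executor could decide locally which cached segments are invalid is sketched below in Java. Everything here is hypothetical: the class `ExecutorSegmentCache`, its method names, and the idea that the executor is handed (or can look up) the set of currently valid segment IDs for a table are assumptions for illustration, not something the thread or CarbonData specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical executor-side cache keyed by table and segment ID. */
class ExecutorSegmentCache {
    // table name -> (segment ID -> cached B-tree, represented here by a plain Object)
    private final Map<String, Map<String, Object>> cache = new HashMap<>();

    void put(String table, String segmentId, Object btree) {
        cache.computeIfAbsent(table, t -> new HashMap<>()).put(segmentId, btree);
    }

    boolean contains(String table, String segmentId) {
        Map<String, Object> segments = cache.get(table);
        return segments != null && segments.containsKey(segmentId);
    }

    /**
     * Drops every cached segment of the table that is not in validSegments
     * (for example, segments replaced by compaction). Returns the number removed.
     */
    int retainValid(String table, Set<String> validSegments) {
        Map<String, Object> segments = cache.get(table);
        if (segments == null) return 0;
        int before = segments.size();
        segments.keySet().retainAll(validSegments);
        int removed = before - segments.size();
        if (segments.isEmpty()) cache.remove(table);
        return removed;
    }

    /** Drops all cached segments of a table, e.g. once the executor learns it was dropped. */
    int dropTable(String table) {
        Map<String, Object> segments = cache.remove(table);
        return segments == null ? 0 : segments.size();
    }
}
```

The sketch only covers the bookkeeping. How an executor would learn that a table was dropped or cleaned when no query for that table ever reaches it is exactly the open problem raised above, and is not solved here.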
Hi Shahid,
Introduce a CacheClient that owns the correct incrementing and decrementing of the access count as objects come into and go out of use. Otherwise, access count handling will become complicated as we add more features to the system.

Regards,
Ramana

On Sun, Dec 4, 2016, 10:31 PM manish gupta <[hidden email]> wrote:
> Hi Sujith,
>
> I agree with your point. We can always send a list of invalid segments to
> the executors in the query model that needs to be cleared from the cache.
> But there are few cases where clearing B-tree cache cannot be ensured like:
> 1. Table is dropped
> 2. Execution of clean table DML command.
>
> In these cases we cannot ensure that invalid objects from cache are cleared
> from all the executors. Removal only from driver can be ensured.
> To handle these cases each executor should have a mechanism to decide for
> the invalid segments/block/dictionary cache.
>
> Regards
> Manish Gupta
|
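The CacheClient idea can be sketched in Java as follows. Only the name `CacheClient` comes from the thread; the `AccessCounts` helper, the `use` method, and the choice to implement `AutoCloseable` are assumptions for illustration. The point of the sketch is that the client remembers every increment it made and undoes all of them in one place, so callers cannot forget a decrement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder of per-object access counts. */
class AccessCounts {
    private final Map<String, Integer> counts = new HashMap<>();

    synchronized void increment(String key) {
        counts.merge(key, 1, Integer::sum);
    }

    synchronized void decrement(String key) {
        // Returning null from the remapping function removes the key once the count hits zero
        counts.computeIfPresent(key, (k, v) -> v > 1 ? v - 1 : null);
    }

    synchronized int get(String key) {
        return counts.getOrDefault(key, 0);
    }
}

/** Sketch of a client that owns increments and decrements for the objects it uses. */
class CacheClient implements AutoCloseable {
    private final AccessCounts counts;
    private final List<String> held = new ArrayList<>(); // one entry per increment made

    CacheClient(AccessCounts counts) {
        this.counts = counts;
    }

    /** Registers that this client is using the object identified by key. */
    void use(String key) {
        counts.increment(key);
        held.add(key);
    }

    /** Releases everything this client acquired, exactly once per use() call. */
    @Override
    public void close() {
        for (String key : held) {
            counts.decrement(key);
        }
        held.clear();
    }
}
```

Used with try-with-resources, for example `try (CacheClient client = new CacheClient(counts)) { client.use("segment_0"); ... }`, the counts are decremented even if query processing throws, which keeps unused objects eligible for eviction.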
+1
|
In reply to this post by Venkata Gollamudi
Hi Sujith,
I agree with you that after compaction there is no use in keeping the segment cache or the block cache for the compacted segments. We should have a mechanism to invalidate the compacted segments' cache at the driver and the block-level cache at the executors. |