Re: [DISCUSSION] Distributed Index Cache Server

Posted by kunalkapoor on Feb 05, 2019; 11:04am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Distributed-Index-Cache-Server-tp75008p75005.html

+ JIRA link for tracking purposes


On Tue, Feb 5, 2019 at 4:27 PM Kunal Kapoor <[hidden email]> wrote:

> Hi All,
>
> Carbon currently caches all block/blocklet datamap index information in
> the driver. For bloom datamaps, it can prune the splits in a distributed
> way using distributed datamap pruning. The first approach has two
> limitations: the driver's memory has to be scaled up, and one driver's
> cache cannot be reused by other drivers. The second approach has its own
> limitation: there is no guarantee that the next query goes to the same
> executor, so the cache may not be reused.
>
>
> Given these problems, there is a need for a centralised index cache
> server.
>
>
> The design document is linked below.
>
>
>
> https://docs.google.com/document/d/161NXxrKLPucIExkWip5mX00x2iOPH6bvsuQnCzzp47E/edit?ts=5c542ab4#heading=h.x0qaehgkncz5
>
>
>
> Thanks
>
> Kunal Kapoor