Re: [DISCUSSION] Cache Pre Priming

Posted by akashnilugal@gmail.com on Aug 21, 2019; 9:42am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Cache-Pre-Priming-tp83559p83660.html

Hi Litao,

Initially, the first count(*) query took around 32 seconds because it had to load the datamaps into the cache, while the second run took about 1.5 to 2 seconds, I think. So with pre-priming we should be able to improve the first-time query considerably.
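
To make the first-versus-second query comparison concrete, here is a minimal sketch in Python. It is only a toy model, not CarbonData or index server code: the names IndexCache, pre_prime and count_star, and the simulated per-segment load cost, are all hypothetical and chosen just for illustration.

```python
import time


class IndexCache:
    """Toy stand-in for the index server cache (hypothetical, not CarbonData code)."""

    def __init__(self, load_cost_s=0.01):
        self.load_cost_s = load_cost_s  # assumed cost of loading one segment's index
        self._cache = {}
        self.loads = 0  # number of segment indexes actually loaded so far

    def _load_segment(self, table, segment):
        # Simulate the expensive read of a segment's index files.
        time.sleep(self.load_cost_s)
        self.loads += 1
        return f"index({table}, {segment})"

    def get(self, table, segment):
        # Lazy path: load on first access, serve from the cache afterwards.
        key = (table, segment)
        if key not in self._cache:
            self._cache[key] = self._load_segment(table, segment)
        return self._cache[key]

    def pre_prime(self, table, segments):
        # Pre-priming path: fill the cache right after data load, before any query.
        for segment in segments:
            self.get(table, segment)


def count_star(cache, table, segments):
    """Touch every segment's index, as count(*) would.

    Returns (elapsed_seconds, loads_done_during_this_query).
    """
    loads_before = cache.loads
    start = time.perf_counter()
    for segment in segments:
        cache.get(table, segment)
    elapsed = time.perf_counter() - start
    return elapsed, cache.loads - loads_before


if __name__ == "__main__":
    segments = list(range(20))

    lazy = IndexCache()
    print("lazy, first query :", count_star(lazy, "t1", segments))
    print("lazy, second query:", count_star(lazy, "t1", segments))

    primed = IndexCache()
    primed.pre_prime("t1", segments)  # done at load time, not at query time
    print("primed, first query:", count_star(primed, "t1", segments))
```

In this model the lazy cache pays the whole loading cost inside the first query, while the pre-primed cache has already paid it at load time, so its first query behaves like the second query of the lazy one. The real numbers of course depend on the table and the cluster.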

Regards,
Akash

On 2019/08/21 03:03:55, tao li <[hidden email]> wrote:

> Hi Akash,
>       Before development, we need to know how much queries can be improved by caching part of the index in advance.
>       We should compare the first and second runs of a query and find the time differences for the important steps.
>       From that we can estimate the performance improvement that caching part of the index in advance can bring.
>
> On 2019/08/15 12:03:09, Akash Nilugal <[hidden email]> wrote:
> > Hi Community,
> >
> > Currently, we have an index server which helps in distributed caching
> > of the datamaps in a separate Spark application.
> >
> > Caching of the datamaps in the index server starts only once a query is
> > fired on the table for the first time: all the datamaps are loaded if
> > count(*) is fired, and only the required ones are loaded for a filter
> > query.
> >
> >
> > The bottleneck here is that until a query is fired on the table, the
> > table's datamaps are not cached.
> >
> > Consider a scenario where we only load data into the table for a whole
> > day and query it the next day. All the segments start loading into the
> > cache at that point, so the first query will be slow.
> >
> >
> > What if we load the datamaps into the cache, that is, pre-prime the
> > cache, without waiting for any query on the table?
> >
> > For example, we could load the cache after every data load is done, or
> > load the cache for all the segments at once, so that the first query
> > does not have to do all this work and runs faster.
> >
> >
> > I have attached the design document for pre-priming the cache in the
> > index server. Please have a look at it; any suggestions or inputs are
> > most welcome.
> >
> >
> > https://drive.google.com/file/d/1YUpDUv7ZPUyZQQYwQYcQK2t2aBQH18PB/view?usp=sharing
> >
> >
> >
> > Regards,
> >
> > Akash R Nilugal
> >
>