[Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

sujith chacko
Hi All,

I am going to optimize the LIKE filter query flow for no-dictionary
columns. Please find the details below.

*Current design:*
LIKE filter queries are currently not pushed down to the carbon layer, so
no block- or blocklet-level pruning can happen before the LIKE filters are
applied. This adds scan overhead, because the system has to scan all the
blocks and blocklets in order to apply the filters.

*Proposed design/solution:*
LIKE filters (startsWith, endsWith, contains) can be pushed down to the
carbon engine layer so that carbon can perform block- and blocklet-level
pruning before applying the filters.
Block-level pruning will happen on the driver side, and blocklet-level
pruning will be done in the executor, as per the existing design.
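
As a rough illustration of the pruning idea (not CarbonData's actual
implementation), the sketch below assumes each block keeps min/max values
for the filter column. `BlockStats`, `may_contain_prefix`, and
`prune_blocks` are hypothetical names chosen for this example. Only a
startsWith filter is handled here; endsWith and contains patterns do not
map onto a min/max range, so a sketch like this would have to keep every
block for them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockStats:
    """Hypothetical per-block min/max statistics for one string column."""
    name: str
    min_value: str
    max_value: str


def may_contain_prefix(stats: BlockStats, prefix: str) -> bool:
    """Return False only when no value in [min, max] can start with prefix."""
    n = len(prefix)
    # Truncating strings to the prefix length preserves lexicographic order,
    # so any value v with min <= v <= max that starts with the prefix implies
    # min[:n] <= prefix <= max[:n]. Blocks failing this test can be skipped.
    return stats.min_value[:n] <= prefix <= stats.max_value[:n]


def prune_blocks(blocks: List[BlockStats], prefix: str) -> List[BlockStats]:
    """Keep only the blocks that still need scanning for LIKE 'prefix%'."""
    return [b for b in blocks if may_contain_prefix(b, prefix)]


# Made-up phone-number ranges, purely for illustration.
blocks = [
    BlockStats("block-0", "1300000000", "1349999999"),
    BlockStats("block-1", "1350000000", "1389999999"),
    BlockStats("block-2", "1390000000", "1599999999"),
]

survivors = prune_blocks(blocks, "138")
print([b.name for b in survivors])  # prints ['block-1']
```

The check is conservative: it can keep a block that turns out to hold no
match, but it never drops a block that could hold one, so the filter still
has to be applied to the rows of the surviving blocks.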

I request everyone to provide feedback and vote on implementing the above
solution in order to improve LIKE filter queries.

Thanks,
Sujith

Re: [Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

Liang Chen
Administrator
Hi

I have one question: for no-dictionary columns with high cardinality, such
as phone numbers, is the pruning cost high or not?

Regards
Liang

In reply to sujith chacko's post of 2016-11-14 15:18 GMT+08:00

Re: [Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

kumarvishal09
+1
Hi Liang,
The pruning cost won't be high, because block pruning is done at the
complete B-tree level, and it will improve query performance for
no-dictionary columns.

-Regards
Kumar Vishal

In reply to Liang Chen's post of Nov 14, 2016 14:01

Re: [Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

sujith chacko
In reply to Liang Chen's post
Hi Liang,
Yes, it's for high-cardinality columns.
Thanks,
Sujith


Re: [Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

ravipesala
+1

In reply to sujith chacko's post of Mon, Nov 14, 2016, 3:54 PM