Apache CarbonData Dev Mailing List archive - Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

Posted by sujith chacko on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Vote-Please-provide-valuable-feedback-s-and-vote-for-Like-filter-query-performance-optimization-tp2893p2902.html

Hi Liang,
Yes, it's for high-cardinality columns.
Thanks,
Sujith

On Nov 14, 2016 2:01 PM, "Liang Chen" <[hidden email]> wrote:

> Hi
>
> I have one question: for no-dictionary columns that have high cardinality,
> such as phone numbers, is the pruning cost high or not?
>
> Regards
> Liang
>
> 2016-11-14 15:18 GMT+08:00 sujith chacko <[hidden email]>:
>
> > Hi All,
> >
> > I am going to optimize the LIKE filter query flow for no-dictionary
> > columns. Please find the details below.
> >
> > *Current design:*
> > For LIKE filter queries, no pushdown to the carbon layer happens, so no
> > block- or blocklet-level pruning can take place before the LIKE filters
> > are applied. This adds overhead while scanning, since the system has to
> > scan all the blocks and blocklets in order to apply the filters.
> >
> > *Proposed design/solution:*
> > LIKE filters (startsWith, endsWith, contains) can be pushed down to the
> > carbon engine layer so that carbon can perform block- and blocklet-level
> > pruning before applying the filters.
> > Block-level pruning will happen on the driver side, and blocklet-level
> > pruning will be done in the executor, as per the existing design.
> >
> > I request everyone to please provide feedback and vote on implementing
> > the above solution in order to improve LIKE filter queries.
> >
> > Thanks,
> > Sujith
> >
>
>
>
> --
> Regards
> Liang
>
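
The block-level pruning idea in the proposal above can be illustrated with a minimal sketch. It assumes each block keeps the minimum and maximum value of the column and that values compare as plain strings. The names BlockStats, may_contain_prefix and prune_blocks are hypothetical and are not CarbonData's API. Only the startsWith case is shown, because a min/max range can exclude a prefix; how endsWith and contains would be pruned is not described in the thread, so the sketch leaves them out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockStats:
    """Hypothetical per-block statistics for one no-dictionary column."""
    name: str
    min_value: str
    max_value: str


def may_contain_prefix(stats: BlockStats, prefix: str) -> bool:
    """Return False only when no value in [min_value, max_value] can start with prefix."""
    # Every string that starts with prefix is >= prefix, so a block whose
    # maximum sorts before prefix cannot hold a match.
    if stats.max_value < prefix:
        return False
    # If the minimum, cut to the prefix length, already sorts after prefix,
    # every value in the block sorts after all strings that start with prefix.
    if stats.min_value[:len(prefix)] > prefix:
        return False
    return True


def prune_blocks(blocks: List[BlockStats], prefix: str) -> List[BlockStats]:
    """Keep only the blocks that still have to be scanned for LIKE 'prefix%'."""
    return [b for b in blocks if may_contain_prefix(b, prefix)]


if __name__ == "__main__":
    blocks = [
        BlockStats("b1", "1300000000", "1399999999"),
        BlockStats("b2", "1500000000", "1599999999"),
        BlockStats("b3", "1380000000", "1520000000"),
    ]
    for block in prune_blocks(blocks, "139"):
        print(block.name)
```

In this sketch the check is conservative: a block is dropped only when its range proves that no match exists, so a kept block may still contain no matching rows. The same test could be applied per blocklet on the executor side, which is where the proposal places blocklet-level pruning.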