Apache CarbonData Dev Mailing List archive - Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

Posted by Liang Chen on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Vote-Please-provide-valuable-feedback-s-and-vote-for-Like-filter-query-performance-optimization-tp2893p2895.html

Hi

I have one question: for no-dictionary columns with high cardinality,
such as phone numbers, is the pruning cost high or not?

Regards
Liang

2016-11-14 15:18 GMT+08:00 sujith chacko <[hidden email]>:

> Hi All,
>
> I am going to optimize the LIKE filter query flow for no-dictionary
> columns. Please find the details below.
>
> *Current design:*
> For LIKE filter queries, no push-down happens to the carbon layer. As a
> result, no block- or blocklet-level pruning can happen before the LIKE
> filters are applied. This adds scan overhead, since the system has to
> scan all the blocks and blocklets in order to apply the filters.
>
> *Proposed design/solution:*
> LIKE filters (startsWith, endsWith, contains) can be pushed down to the
> carbon engine layer so that carbon can perform block- and blocklet-level
> pruning before applying the filters.
> Block-level pruning will happen on the driver side, and blocklet-level
> pruning will be done in the executor, as per the existing design.
>
> I request everyone to provide feedback and vote on implementing the
> above solution in order to improve LIKE filter queries.
>
> Thanks,
> Sujith
>
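
To make the proposal above concrete, here is a minimal sketch of how
min/max-based pruning could work for a startsWith filter. It assumes that
each blocklet records the minimum and maximum value of the column and that
values compare as plain strings. BlockletStats, may_contain_prefix and prune
are hypothetical names used only for illustration; they are not CarbonData
APIs. endsWith and contains are not covered by this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockletStats:
    """Hypothetical per-blocklet statistics (not a CarbonData class)."""
    name: str
    min_value: str
    max_value: str


def may_contain_prefix(stats: BlockletStats, prefix: str) -> bool:
    """Return False only when no value in [min, max] can start with prefix."""
    k = len(prefix)
    # Every value that starts with the prefix is >= the prefix itself,
    # so a blocklet whose maximum sorts before the prefix cannot match.
    if stats.max_value < prefix:
        return False
    # Truncating to k characters preserves lexicographic order, so if the
    # truncated minimum already sorts after the prefix, every value does.
    if stats.min_value[:k] > prefix:
        return False
    return True


def prune(blocklets: List[BlockletStats], prefix: str) -> List[BlockletStats]:
    """Keep only the blocklets that still need to be scanned."""
    return [b for b in blocklets if may_contain_prefix(b, prefix)]


# Assumed example data: phone-number-like strings spread over three blocklets.
blocklets = [
    BlockletStats("b0", "1300000000", "1359999999"),
    BlockletStats("b1", "1360000000", "1389999999"),
    BlockletStats("b2", "1390000000", "1999999999"),
]

# Equivalent to: WHERE phone LIKE '138%'
survivors = [b.name for b in prune(blocklets, "138")]
print(survivors)
```

In this sketch the check costs two string comparisons per blocklet,
independent of the column's cardinality. How many blocklets it actually
skips depends on how the values are laid out across blocklets.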
