[jira] [Closed] (CARBONDATA-3807) Filter queries and projection queries with bloom columns are not hitting the bloom datamap.

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanna Ravichandran closed CARBONDATA-3807.
---------------------------------------------
    Fix Version/s: 2.0.0
       Resolution: Not A Bug

After setting enable.query.statistics and verifying the plan again, we could see the Bloom filter details in the EXPLAIN output. They appear in the plan only after both the bloom index creation and a load have happened. With only the bloom index created, they do not appear in the plan.
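
For reference, one possible reading of the note above is sketched below, reusing the statements from the ticket with the load moved after the index creation. It assumes that enable.query.statistics has already been set to true (for example in carbon.properties) before the session starts, and that "create bloom index + load" means a load performed once the index exists; neither detail is spelled out in the ticket, and the ticket does not show what the plan output looks like.

```sql
-- Assumption: enable.query.statistics=true is already configured for this session.
drop table if exists uniqdata;
CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as carbondata;

-- Create the bloom index first (same statement as in the ticket).
create datamap datamapuniq_b1 on table uniqdata(cust_name) as 'bloomfilter' PROPERTIES ('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001');

-- Then load, so that a load happens after the index exists (assumed ordering).
load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');

-- Check the plan for the filter query on the bloom column.
show indexes on uniqdata;
explain select count(*) from uniqdata where cust_name="CUST_NAME_00000";
```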

> Filter queries and projection queries with bloom columns are not hitting the bloom datamap.
> -------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3807
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3807
>             Project: CarbonData
>          Issue Type: Bug
>         Environment: Ant cluster - opensource
>            Reporter: Prasanna Ravichandran
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: bloom-filtercolumn-plan.png, bloom-show index.png
>
>
> Filter queries and projection queries with bloom columns are not hitting the bloom datamap.
>  According to the plan, the bloom datamap is unused even though it has been created.
> Test queries: 
> drop table if exists uniqdata;
>  CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as carbondata;
>  load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> create datamap datamapuniq_b1 on table uniqdata(cust_name) as 'bloomfilter' PROPERTIES ('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001');
> show indexes on uniqdata;
> explain select count(*) from uniqdata where cust_name="CUST_NAME_00000"; --not hitting;
> explain select cust_name from uniqdata; --not hitting;
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)