[jira] [Closed] (CARBONDATA-3807) Filter queries and projection queries with bloom columns are not hitting the bloom datamap.

Akash R Nilugal (Jira)

Oct 14, 2020; 7:03am

[ https://issues.apache.org/jira/browse/CARBONDATA-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanna Ravichandran closed CARBONDATA-3807.
---------------------------------------------
Fix Version/s: 2.0.0
Resolution: Not A Bug

After enabling enable.query.statistics and verifying the plan again, we could see the bloom filter details in the EXPLAIN output. They appear in the plan only after the bloom index has been created and a load has been performed. With only the bloom index created, they do not appear in the plan.
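
The sequence described above could be checked roughly as follows. This is only a sketch that reuses the reporter's own statements from the quoted issue, reordered so that the load runs after the bloom datamap is created. It assumes enable.query.statistics has already been set to true in the cluster configuration; how it was set is not stated in this thread.

```sql
-- Assumption: enable.query.statistics=true is already configured for the cluster.
drop table if exists uniqdata;
CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as carbondata;

-- Create the bloom datamap first ...
create datamap datamapuniq_b1 on table uniqdata(cust_name) as 'bloomfilter' PROPERTIES ('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001');

-- ... then load, so that the load happens with the bloom datamap in place.
load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');

show indexes on uniqdata;

-- Inspect the plan of the filter query on the bloom column.
explain select count(*) from uniqdata where cust_name="CUST_NAME_00000";
```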

> Filter queries and projection queries with bloom columns are not hitting the bloom datamap.
> -------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-3807
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3807
> Project: CarbonData
> Issue Type: Bug
> Environment: Ant cluster - opensource
> Reporter: Prasanna Ravichandran
> Priority: Major
> Fix For: 2.0.0
>
> Attachments: bloom-filtercolumn-plan.png, bloom-show index.png
>
>
> Filter queries and projection queries with bloom columns are not hitting the bloom datamap.
> According to the plan, the bloom datamap is unused even though it has been created.
> Test queries:
> drop table if exists uniqdata;
> CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as carbondata;
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> create datamap datamapuniq_b1 on table uniqdata(cust_name) as 'bloomfilter' PROPERTIES ('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001');
> show indexes on uniqdata;
> explain select count(*) from uniqdata where cust_name="CUST_NAME_00000"; --not hitting;
> explain select cust_name from uniqdata; --not hitting;
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)