[jira] [Commented] (CARBONDATA-2537) MV Dataset - User queries with 'having' condition is not accessing the data from the MV datamap.



Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523570#comment-16523570 ]

xubo245 commented on CARBONDATA-2537:
-------------------------------------

The query does match the datamap once the datamap has been rebuilt:


{code:java}
0: jdbc:hive2://127.0.0.1:10000> create datamap mv_hav using 'mv' as select empno from originTable having salary>10000;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.085 seconds)
0: jdbc:hive2://127.0.0.1:10000> rebuild datamap mv_hav;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.131 seconds)
0: jdbc:hive2://127.0.0.1:10000> explain select empno from originTable having salary>10000;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|                                                                                                                                               plan                                                                                                                                               |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| == CarbonData Profiler ==
                                                                                                                                                                                                                                                                       |
| == Physical Plan ==
*Project [origintable_empno#7961 AS empno#7985]
+- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table name :mv_hav_table, Schema :Some(StructType(StructField(origintable_empno,IntegerType,true))) ] default.mv_hav_table[origintable_empno#7961]  |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
2 rows selected (0.071 seconds)
0: jdbc:hive2://127.0.0.1:10000>  select empno from originTable having salary>10000;
+--------+--+
| empno  |
+--------+--+
+--------+--+
No rows selected (0.066 seconds)

{code}
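The check done by eye above (which table the physical plan scans) can be sketched as a small helper. This is only an illustration, not part of CarbonData: `MvPlanCheck`, `scannedTables` and `usesMv` are hypothetical names, and the `<datamap>_table` naming is assumed from the `mv_hav_table` scan shown in the output above.

```scala
// Hypothetical helper (not a CarbonData API): inspects EXPLAIN text
// to see which tables the physical plan scans.
object MvPlanCheck {
  // Matches the "Table name :<name>" fragment printed by
  // CarbonDatasourceHadoopRelation in the plans shown in this thread.
  private val TableName = """Table name :(\w+)""".r

  /** All table names scanned by the plan, lower-cased. */
  def scannedTables(plan: String): Set[String] =
    TableName.findAllMatchIn(plan).map(_.group(1).toLowerCase).toSet

  /** True if the plan scans the table backing the given MV datamap.
    * Assumption: the backing table is named "<datamap>_table",
    * as seen with mv_hav -> mv_hav_table above. */
  def usesMv(plan: String, datamapName: String): Boolean =
    scannedTables(plan).contains(s"${datamapName.toLowerCase}_table")
}

// Possible usage in spark-shell, with `carbon` being the session from the issue:
//   val plan = carbon.sql("explain select empno from originTable having salary>10000")
//     .collect().map(_.getString(0)).mkString("\n")
//   MvPlanCheck.usesMv(plan, "mv_hav")
```

Matching on the plan text is brittle across versions, so this is best treated as a quick manual check rather than a stable test.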


> MV Dataset - User queries with 'having' condition is not accessing the data from the MV datamap.
> ------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2537
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2537
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>         Environment: 3 Node Opensource ANT cluster.
>            Reporter: Prasanna Ravichandran
>            Assignee: xubo245
>            Priority: Minor
>              Labels: Carbondata, MV, Materialistic_Views
>         Attachments: data.csv, image-2018-05-25-15-50-23-903.png
>
>
> User queries with a 'having' condition do not access the data from the MV datamap. They read the data from the main table instead.
> Test queries - spark shell:
> scala>carbon.sql("CREATE TABLE originTable (empno int, empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, utilization int,salary int) STORED BY 'org.apache.carbondata.format'").show()
> ++
> ||
> ++
> ++
> scala>carbon.sql("LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"','timestampformat'='dd-MM-yyyy')").show()
> ++
> ||
> ++
> ++
> scala> carbon.sql("select empno from originTable having salary>10000").show(200,false)
> +-----+
> |empno|
> +-----+
> |14 |
> |15 |
> |20 |
> |19 |
> +-----+
> scala> carbon.sql("create datamap mv_hav using 'mv' as select empno from originTable having salary>10000").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain select empno from originTable having salary>10000").show(200,false)
> +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |plan |
> +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |== CarbonData Profiler ==
> Table Scan on origintable
>  - total blocklets: 1
>  - filter: (salary <> null and salary > 10000)
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *Project [empno#1131]
> +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table name :origintable, Schema :Some(StructType(StructField(empno,IntegerType,true), StructField(empname,StringType,true), StructField(designation,StringType,true), StructField(doj,TimestampType,true), StructField(workgroupcategory,IntegerType,true), StructField(workgroupcategoryname,StringType,true), StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), StructField(projectcode,IntegerType,true), StructField(projectjoindate,TimestampType,true), StructField(projectenddate,TimestampType,true), StructField(attendance,IntegerType,true), StructField(utilization,IntegerType,true), StructField(salary,IntegerType,true))) ] default.origintable[empno#1131] PushedFilters: [IsNotNull(salary), GreaterThan(salary,10000)]|
> +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
>  
>  
> !image-2018-05-25-15-50-23-903.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)