[jira] [Created] (CARBONDATA-2527) [MV] MV not selected when Session/JVM relaunched


Akash R Nilugal (Jira)
Babulal created CARBONDATA-2527:

             Summary: [MV] MV not selected when Session/JVM relaunched
                 Key: CARBONDATA-2527
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2527
             Project: CarbonData
          Issue Type: Bug
            Reporter: Babulal

Start the Thrift Server (or a CarbonSession) and run the following:


0: jdbc:hive2://> create table tt11 ( name string, age int) stored by 'carbondata';
| Result |
No rows selected (0.302 seconds)
0: jdbc:hive2://> insert into tt11 select 'babu',12;
| Result |
No rows selected (11.291 seconds)
0: jdbc:hive2://> create datamap datamap29 using 'mv' as select age from tt11 ;
| Result |
No rows selected (0.568 seconds)
0: jdbc:hive2://> rebuild datamap datamap29;
| Result |
No rows selected (5.664 seconds)
0: jdbc:hive2://> explain select age from tt11 ;
| plan |
| == CarbonData Profiler ==
Table Scan on datamap29_table
 - total blocklets: 1
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0
| == Physical Plan ==
*BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table name :datamap29_table, Schema :Some(StructType(StructField(tt11_age,IntegerType,true))) ] default.datamap29_table[tt11_age#574] |



Now close beeline and open it again. Alternatively, if reproducing from the CarbonSession example, relaunch it and just run spark.sql("explain select age from tt11 "), since the table and the MV have already been created.


0: jdbc:hive2://> explain select age from tt11 ;
| plan |
| == CarbonData Profiler ==
Table Scan on tt11
 - total blocklets: 1
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0
| == Physical Plan ==
*BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table name :tt11, Schema :Some(StructType(StructField(name,StringType,true), StructField(age,IntegerType,true))) ] default.tt11[age#1308] |
2 rows selected (0.889 seconds)



So before beeline was closed, the MV table (datamap29_table) was selected, but after beeline is reopened, the fact table (tt11) is selected instead.


Cause:

private[mv] def lookupSummaryDataset(plan: LogicalPlan): Option[SummaryDataset] = readLock {
  summaryDatasets.find(sd => plan.sameResult(sd.plan))
}


This uses Spark's sameResult method to compare the logical plans, but when the query is re-run after a relaunch the expression IDs have changed, so the comparison returns false.

Logical Plan from MV

Project [age#19]
+- SubqueryAlias tt11
 +- Relation[name#18,age#19] CarbonDatasourceHadoopRelation [ Database name :default, Table name :tt11, Schema :Some(StructType(StructField(name,StringType,true), StructField(age,IntegerType,true))) ]


Logical Plan from User Query

Project [age#97]
+- SubqueryAlias tt11
 +- Relation[name#96,age#97] CarbonDatasourceHadoopRelation [ Database name :default, Table name :tt11, Schema :Some(StructType(StructField(name,StringType,true), StructField(age,IntegerType,true))) ] 

The expression IDs of age and name have changed (#19/#18 in the MV plan versus #97/#96 in the user query plan). This is what causes the issue.
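
To illustrate the mismatch, here is a minimal, self-contained sketch. It does not use Spark or CarbonData: Attr, Relation, Project and normalize are hypothetical stand-ins invented for this example, and normalize is only one possible way to make a comparison independent of the raw expression IDs, not necessarily how the issue should be or was fixed. The IDs are taken from the two plans above.

```scala
// Toy model only: hypothetical stand-ins, NOT Spark's Catalyst classes.
final case class Attr(name: String, exprId: Long)

sealed trait Plan
final case class Relation(table: String, output: Seq[Attr]) extends Plan
final case class Project(columns: Seq[Attr], child: Plan) extends Plan

object ExprIdDemo {
  // Plan stored for the MV (expression IDs #18 and #19, as in the report).
  val mvPlan: Plan =
    Project(Seq(Attr("age", 19)), Relation("tt11", Seq(Attr("name", 18), Attr("age", 19))))

  // Plan of the same query after a relaunch (expression IDs #96 and #97).
  val userPlan: Plan =
    Project(Seq(Attr("age", 97)), Relation("tt11", Seq(Attr("name", 96), Attr("age", 97))))

  // Collect expression IDs bottom-up, in order of appearance.
  private def idsInOrder(plan: Plan): Seq[Long] = plan match {
    case Relation(_, output)  => output.map(_.exprId)
    case Project(cols, child) => idsInOrder(child) ++ cols.map(_.exprId)
  }

  // Replace each expression ID by the position of its first appearance, so
  // plans with the same shape compare equal regardless of the raw IDs.
  def normalize(plan: Plan): Plan = {
    val mapping: Map[Long, Long] =
      idsInOrder(plan).distinct.zipWithIndex.map { case (id, i) => id -> i.toLong }.toMap

    def renumber(a: Attr): Attr = a.copy(exprId = mapping(a.exprId))

    def rewrite(p: Plan): Plan = p match {
      case Relation(table, output) => Relation(table, output.map(renumber))
      case Project(cols, child)    => Project(cols.map(renumber), rewrite(child))
    }
    rewrite(plan)
  }

  def main(args: Array[String]): Unit = {
    // Raw comparison fails because only the expression IDs differ.
    println(mvPlan == userPlan)
    // Comparison after normalizing the IDs succeeds.
    println(normalize(mvPlan) == normalize(userPlan))
  }
}
```

In this toy model the raw comparison is false and the normalized comparison is true, which mirrors the behaviour described above: the two plans differ only in their expression IDs.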

